Method and apparatus for noise suppression

ABSTRACT

An apparatus for noise suppression has a converter for converting an input signal into a frequency-domain signal, an SNR calculator for determining a signal-to-noise ratio (SNR) using the frequency-domain signal, a spectral gain generator for determining a spectral gain based on the SNR, a spectral gain modification unit for correcting the spectral gain to determine a modified spectral gain, a multiplier for weighting the frequency-domain signal using the modified spectral gain, and an inverse converter for converting the weighted frequency-domain signal into a time-domain signal. In the apparatus for noise suppression, it is preferable to calculate a weighted noisy speech power spectrum from the power spectrum of noisy speech and the power spectrum of estimated noise, thereby determining an SNR.

TECHNICAL FIELD

The present invention relates to a method of and an apparatus for suppressing noise superposed on a desired speech signal.

BACKGROUND ART

A noise suppressor is an apparatus which suppresses noise superposed on a desired speech signal. A noise suppressor operates to estimate the power spectrum of a noise component using an input signal that has been transformed into a frequency-domain signal, and subtracts the estimated noise power spectrum from the input signal, thereby suppressing the noise mixed with the desired speech signal. A noise suppressor can be used to suppress nonstationary noise by detecting a silent section of speech and updating the power spectrum of a noise component.

A noise suppressor is described in IEEE TRANSACTIONS ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, Vol. 32, No. 6, pp. 1109-1121, DECEMBER 1984 (Reference 1). In this paper, the noise suppressor uses a technique known as a minimum mean-square error short-time spectral amplitude process. FIG. 1 shows the structure of the noise suppressor described in Reference 1. A signal including a desired speech signal and noise mixed therewith will hereinafter be referred to as a noisy speech signal.

The noise suppressor shown in FIG. 1 comprises input terminal 11, frame decomposition unit 1, windowing unit 2, Fourier transform unit 3, voice activity detector 4, noise estimation unit 51, frequency-dependent SNR (signal-to-noise ratio) calculator 6, a-priori SNR estimator 7, spectral gain generator 8, inverse Fourier transform unit 9, frame synthesis unit 10, output terminal 12, counter 13, and multiplexed multipliers 16, 17. In the noise suppressor, input terminal 11 is supplied with a noisy speech signal as a sequence of samples. Samples of the noisy speech signal are then supplied to frame decomposition unit 1, which divides the noisy speech signal into frames with K/2 samples, where K represents an even number. The noisy speech signal samples which are divided into frames are supplied to windowing unit 2, in which they are multiplied by a window function w(t). A signal ȳ_(n)(t) produced by windowing the n^(th)-frame of the input signal y_(n)(t) (t=0, 1, . . . , K/2−1) with w(t) is expressed by the following equation:

$\begin{matrix}{{\bar{y}_{n}(t)} = {{w(t)}{y_{n}(t)}}} & (1)\end{matrix}$

In the noise suppressor, two successive frames are generally overlapped and windowed. If it is assumed that 50% of the frame length is used as the overlap length, then windowing unit 2 outputs ȳ_(n)(t) (t=0, 1, . . . , K−1) expressed by equations (2) and (3):

$\begin{matrix}{{\bar{y}_{n}(t)} = {{w(t)}{y_{n - 1}(t)}}} & (2)\end{matrix}$

$\begin{matrix}{{\bar{y}_{n}\left( {t + {K/2}} \right)} = {{w\left( {t + {K/2}} \right)}{y_{n}(t)}}} & (3)\end{matrix}$

In the following description, 50% overlap is assumed. A Hanning window expressed by equation (4), for example, may be used as w(t):

$\begin{matrix}{{w(t)} = \left\{ \begin{matrix}{{0.5 + {0.5{\cos\left( \frac{\pi\left( {t - {K/2}} \right)}{K/2} \right)}}},} & {0 \leq t < K} \\{0,} & \text{otherwise}\end{matrix} \right.} & (4)\end{matrix}$
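The windowing of equations (2) to (4) can be sketched as follows. This is only an illustrative sketch: it assumes the window of equation (4) is written in the time index t, and the function names `hann_window` and `window_frame` are hypothetical and do not appear in Reference 1.

```python
import math

def hann_window(K):
    """Window of equation (4): 0.5 + 0.5*cos(pi*(t - K/2)/(K/2)) for 0 <= t < K."""
    half = K / 2
    return [0.5 + 0.5 * math.cos(math.pi * (t - half) / half) for t in range(K)]

def window_frame(prev_half, cur_half, w):
    """Equations (2) and (3): place the previous K/2-sample frame in the first
    half, the current K/2-sample frame in the second half, and weight all
    K samples with w(t)."""
    K = len(w)
    assert len(prev_half) == len(cur_half) == K // 2
    return [w[t] * s for t, s in enumerate(list(prev_half) + list(cur_half))]

K = 8  # hypothetical frame length, chosen only for the example
w = hann_window(K)
windowed = window_frame([1.0] * (K // 2), [1.0] * (K // 2), w)
```

With this window, w(t) + w(t + K/2) = 1 for every t, which is why frames overlapped by 50% can later be summed without an amplitude ripple.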

The windowed output ȳ_(n)(t) is supplied to Fourier transform unit 3, which converts the windowed output ȳ_(n)(t) into a noisy speech spectrum Y_(n)(k). The noisy speech spectrum Y_(n)(k) is separated into a phase and an amplitude. The noisy speech phase spectrum arg Y_(n)(k) is supplied to inverse Fourier transform unit 9, and the spectral amplitude of noisy speech |Y_(n)(k)| is supplied to voice activity detector 4, multiplexed multiplier 16, and multiplexed multiplier 17.

Voice activity detector 4 determines whether there is speech or not based on the spectral amplitude of noisy speech |Y_(n)(k)|, and transmits a voice activity detection flag that is set in accordance with the determined result to noise estimation unit 51. Multiplexed multiplier 17 calculates a noisy speech power spectrum using the supplied spectral amplitude of noisy speech |Y_(n)(k)|, and provides the calculated noisy speech power spectrum to noise estimation unit 51 and frequency-dependent SNR calculator 6.

Noise estimation unit 51 estimates a power spectrum of the noise using the voice activity detection flag, the noisy speech power spectrum, and a count value supplied from counter 13, and transmits the estimated power spectrum to frequency-dependent SNR calculator 6 as an estimated noise power spectrum. Frequency-dependent SNR calculator 6 calculates an SNR for each frequency by using the noisy speech power spectrum and the estimated noise power spectrum which have been supplied thereto, and supplies the calculated SNR as an a-posteriori SNR to a-priori SNR estimator 7 and spectral gain generator 8.

A-priori SNR estimator 7 estimates an a-priori SNR using the a-posteriori SNR supplied thereto and a spectral gain supplied from spectral gain generator 8, and supplies the estimated a-priori SNR as feedback to spectral gain generator 8.

Spectral gain generator 8 generates a spectral gain using the a-posteriori SNR and the estimated a-priori SNR which are supplied thereto as inputs, supplies the spectral gain to a-priori SNR estimator 7 as feedback, and also transmits the generated spectral gain to multiplexed multiplier 16.

Multiplexed multiplier 16 weights the spectral amplitude of noisy speech |Y_(n)(k)| supplied from Fourier transform unit 3 with the spectral gain Ḡ_(n)(k) supplied from spectral gain generator 8, thus determining a spectral amplitude of the enhanced speech |X̄_(n)(k)|, and transmits the spectral amplitude of the enhanced speech |X̄_(n)(k)| to inverse Fourier transform unit 9. The spectral amplitude of the enhanced speech |X̄_(n)(k)| is expressed by equation (5):

$\begin{matrix}{{\left| {\bar{X}_{n}(k)} \right|} = {{\bar{G}_{n}(k)}{\left| {Y_{n}(k)} \right|}}} & (5)\end{matrix}$

Inverse Fourier transform unit 9 multiplies the spectral amplitude of the enhanced speech |X̄_(n)(k)| supplied from multiplexed multiplier 16 by the noisy speech phase spectrum arg Y_(n)(k) supplied from Fourier transform unit 3, thus determining enhanced speech X̄_(n)(k). That is, inverse Fourier transform unit 9 carries out a calculation according to equation (6):

$\begin{matrix}{{\bar{X}_{n}(k)} = {{\left| {\bar{X}_{n}(k)} \right|}\arg{Y_{n}(k)}}} & (6)\end{matrix}$

Inverse Fourier transform unit 9 performs an inverse Fourier transform on the produced enhanced speech X̄_(n)(k), producing a time-domain sequence of samples x̄_(n)(t) (t=0, 1, . . . , K−1) where one frame is made up of K samples, and transmits the time-domain samples x̄_(n)(t) to frame synthesis unit 10. Frame synthesis unit 10 takes out K/2 samples from each of two adjacent frames of x̄_(n)(t), and overlaps the K/2 samples, producing enhanced speech x̂_(n)(t) according to equation (7). The produced enhanced speech x̂_(n)(t) (t=0, 1, . . . , K−1) is transmitted as an output from frame synthesis unit 10 to output terminal 12.

$\begin{matrix}{{\hat{x}_{n}(t)} = {{\bar{x}_{n - 1}\left( {t + {K/2}} \right)} + {\bar{x}_{n}(t)}}} & (7)\end{matrix}$
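The gain weighting and frame synthesis of equations (5) to (7) can be illustrated with a short sketch. It assumes that the multiplication by arg Y_(n)(k) in equation (6) means reattaching the noisy phase as a unit-magnitude complex factor, and that each call to the synthesis step produces K/2 output samples; the function names are hypothetical.

```python
import cmath

def enhance_spectrum(Y, G):
    """Equations (5) and (6): scale each noisy amplitude |Y(k)| by its gain
    and reattach the noisy phase arg Y(k) (assumed to act as exp(j*arg Y))."""
    return [cmath.rect(g * abs(y), cmath.phase(y)) for y, g in zip(Y, G)]

def overlap_add(prev_frame, cur_frame):
    """Equation (7): add the second half of the previous K-sample frame
    to the first half of the current K-sample frame."""
    half = len(cur_frame) // 2
    return [prev_frame[t + half] + cur_frame[t] for t in range(half)]

X = enhance_spectrum([3 + 4j, -2j], [0.5, 1.0])
out = overlap_add([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0])
```

In the example, the first bin keeps its phase while its amplitude drops from 5 to 2.5, and the synthesis step returns the two samples 3 + 10 and 4 + 20.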

Reference 1 discloses no details about how to implement voice activity detector 4 included in the noise suppressor shown in FIG. 1. However, one example of the voice activity detector that can be used in the noise suppressor is available in “Proceedings of National Convention of the Acoustical Society of Japan”, March 2000, pages 321-322 (Reference 2). The voice activity detector shown in Reference 2 will be described below as a conventional implemented example of voice activity detector 4. As shown in FIG. 2, voice activity detector 4 comprises threshold memory 401, comparator 402, multiplier 404, logarithmic calculator 405, power calculator 406, weighted adder 407, weight memory 408, and NOT circuit 409.

In voice activity detector 4, the spectral amplitude of noisy speech supplied from Fourier transform unit 3 (FIG. 1) is supplied to power calculator 406. Power calculator 406 calculates the sum of powers |Y_(n)(k)|² of the spectral amplitude of noisy speech from k=0 to K−1, and transmits the calculated sum to logarithmic calculator 405. Logarithmic calculator 405 determines a logarithm of the supplied noisy speech spectrum power, and supplies the logarithm to multiplier 404. Multiplier 404 multiplies the supplied logarithm by a constant to determine a noisy speech power Q_(n), and supplies the noisy speech power Q_(n) to comparator 402 and weighted adder 407. Specifically, noisy speech power Q_(n) in the n^(th)-frame is expressed by the following equation:

$\begin{matrix}{Q_{n} = {10{\log_{10}\left( {\sum\limits_{k = 0}^{K - 1}{\left| {Y_{n}(k)} \right|}^{2}} \right)}}} & (8)\end{matrix}$

The voice activity detector disclosed in Reference 2 determines Q_(n) according to equation (9), using time-domain samples ȳ_(n)(t).

$\begin{matrix}{Q_{n} = {10{\log_{10}\left( {\sum\limits_{t = 0}^{K - 1}{{\overset{\_}{y}}_{n}^{2}(t)}} \right)}}} & (9)\end{matrix}$

As described in “Digital Signal Processing”, 1985, Corona, pages 75-76 (Reference 3), it is known that equations (8) and (9) are equivalent by Parseval's theorem.
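The relation between equations (8) and (9) can be checked numerically. The sketch below uses a naive, unnormalized discrete Fourier transform, which is an assumption about the transform convention; with that convention the two values of Q_(n) differ only by the constant 10 log₁₀ K, and they coincide when the transform is scaled by 1/√K. The function names and the sample values are hypothetical.

```python
import cmath
import math

def dft(y):
    """Naive unnormalized DFT: Y(k) = sum_t y(t) * exp(-j*2*pi*k*t/K)."""
    K = len(y)
    return [sum(y[t] * cmath.exp(-2j * math.pi * k * t / K) for t in range(K))
            for k in range(K)]

def log_power_freq(Y):
    """Equation (8): 10*log10 of the summed squared spectral amplitudes."""
    return 10.0 * math.log10(sum(abs(v) ** 2 for v in Y))

def log_power_time(y):
    """Equation (9): 10*log10 of the summed squared time-domain samples."""
    return 10.0 * math.log10(sum(v * v for v in y))

y = [0.5, -1.0, 2.0, 0.25, -0.75, 1.5, 0.0, -0.5]  # arbitrary windowed frame
offset = log_power_freq(dft(y)) - log_power_time(y)  # equals 10*log10(K)
```

Because the offset is the same for every frame, it can be absorbed into the threshold, so either form can drive the comparator.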

Comparator 402 is supplied with a threshold value TH_(n) from threshold memory 401. Comparator 402 compares the output from multiplier 404 with the threshold value TH_(n). If TH_(n)>Q_(n), then comparator 402 outputs “1” representing a speech section, and if TH_(n)≦Q_(n), then comparator 402 outputs “0” representing a silent section, as a voice activity detection flag. The output from comparator 402 is used as the voice activity detection flag, and is also supplied to NOT circuit 409. NOT circuit 409 supplies its output as weighted adder control signal 905 for weighted adder 407. Weighted adder 407 is also supplied with threshold value 902 from threshold memory 401 and weight 903 from weight memory 408.

Weighted adder 407 selectively updates threshold value 902 supplied from threshold memory 401 based on weighted adder control signal 905, and supplies updated threshold value 904 as feedback to threshold memory 401. The updated threshold value TH_(n) is determined by performing weighted addition of a threshold value TH_(n−1) and noisy speech power 901 using weight 903 from weight memory 408. The updated threshold value TH_(n) is calculated only when weighted adder control signal 905, which is the output from NOT circuit 409, is equal to “1”, i.e., only during a silent section. Updated threshold value 904 is then supplied as feedback to threshold memory 401.

As shown in FIG. 3, power calculator 406 has demultiplexer 4061, K multipliers 4062₀ to 4062_(K−1), and adder 4063. The multiplexed spectral amplitude of noisy speech supplied from Fourier transform unit 3 (FIG. 1) is separated by demultiplexer 4061 into K frequency-dependent samples, which are supplied respectively to multipliers 4062₀ to 4062_(K−1). Multipliers 4062₀ to 4062_(K−1) square the supplied input signals, respectively, and transmit the squared signals to adder 4063, which determines the sum of the input signals and outputs the determined sum.

As shown in FIG. 4, weighted adder 407 has multipliers 4071, 4073, constant multiplier 4075, and adders 4072, 4074. Weighted adder 407 is supplied with noisy speech power 901 from multiplier 404 (FIG. 2), threshold value 902 from threshold memory 401 (FIG. 2), weight 903 from weight memory 408 (FIG. 2), and weighted adder control signal 905 from NOT circuit 409 (FIG. 2) as inputs thereto. Weight 903 having a value β is transmitted to constant multiplier 4075 and multiplier 4073. Constant multiplier 4075 multiplies the input signal by −1 to produce a value −β, and transmits the value −β to adder 4074, which is supplied also with 1 as another input. Adder 4074 thus outputs a sum 1−β, which is supplied to multiplier 4071. Multiplier 4071 multiplies the sum 1−β by noisy speech power Q_(n) as another input thereto, producing a product (1−β)Q_(n) that is transmitted to adder 4072. Multiplier 4073 multiplies the value β supplied as weight 903 by threshold value 902, and transmits a product βTH_(n−1) to adder 4072. Adder 4072 adds βTH_(n−1) and (1−β)Q_(n), and outputs the sum as updated threshold value 904. The updated threshold value TH_(n) is calculated only when weighted adder control signal 905 is equal to “1”. That is, weighted adder 407 has a function to update TH_(n−1) to determine TH_(n) during a silent section according to the following equation, where β represents the value of weight 903:

$\begin{matrix}{{TH}_{n} = \left\{ \begin{matrix}{{TH}_{n - 1},} & {{TH}_{n} \geq Q_{n}} \\{{{\beta\;{TH}_{n - 1}} + {\left( {1 - \beta} \right)Q_{n}}},} & {{TH}_{n} < Q_{n}}\end{matrix} \right.} & (10)\end{matrix}$
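One frame of the threshold comparison and update can be sketched as follows. The sketch follows the comparison direction exactly as stated in the description of comparator 402 (flag “1” when the stored threshold exceeds Q_(n)) and treats the value read from threshold memory 401 as TH_(n−1); both points are assumptions about how the text is to be read. The function name `vad_step` and the numeric values are hypothetical.

```python
def vad_step(q_n, th_prev, beta):
    """Return (flag, th_n) for one frame.

    flag follows comparator 402 as described: 1 when th_prev > q_n, else 0.
    The threshold is updated by weighted addition only when the NOT of the
    flag is 1, i.e. when flag == 0; otherwise it is carried over unchanged.
    """
    flag = 1 if th_prev > q_n else 0
    control = 1 - flag  # output of NOT circuit 409
    if control == 1:
        th_n = beta * th_prev + (1.0 - beta) * q_n
    else:
        th_n = th_prev
    return flag, th_n

flag_a, th_a = vad_step(q_n=10.0, th_prev=20.0, beta=0.9)  # no update
flag_b, th_b = vad_step(q_n=30.0, th_prev=20.0, beta=0.9)  # update
```

A weight β close to 1 makes the threshold track the frame power slowly, which is the usual reason for choosing a recursive average here.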

FIG. 5 shows an example of an arrangement of multiplexed multiplier 17 included in the noise suppressor shown in FIG. 1. Multiplexed multiplier 17 has K multipliers 1701₀ to 1701_(K−1), demultiplexers 1702, 1703, and multiplexer 1704. In multiplexed multiplier 17, the multiplexed spectral amplitude of noisy speech supplied from Fourier transform unit 3 (FIG. 1) is separated by demultiplexers 1702, 1703 into K frequency-dependent samples, which are supplied respectively to multipliers 1701₀ to 1701_(K−1). Multipliers 1701₀ to 1701_(K−1) square the supplied input signals, respectively, and transmit the squared signals to multiplexer 1704, which multiplexes the input signals and outputs the multiplexed signal as a noisy speech power spectrum.

As shown in FIG. 6, noise estimation unit 51 included in the noise suppressor shown in FIG. 1 has demultiplexer 502, multiplexer 503, and K frequency-dependent noise estimation units 514₀ to 514_(K−1). In noise estimation unit 51, the voice activity detection flag supplied from voice activity detector 4 (FIG. 1) and the count value supplied from counter 13 (FIG. 1) are transmitted to frequency-dependent noise estimation units 514₀ to 514_(K−1). The noisy speech power spectrum supplied from multiplexed multiplier 17 (FIG. 1) is transmitted to demultiplexer 502. Demultiplexer 502 separates the supplied multiplexed noisy speech power spectrum into K frequency-dependent components, and transmits the K frequency-dependent components respectively to frequency-dependent noise estimation units 514₀ to 514_(K−1). Frequency-dependent noise estimation units 514₀ to 514_(K−1) calculate noise power spectrum components using the noisy speech power spectrum supplied from demultiplexer 502, and transmit the calculated noise power spectrum components to multiplexer 503. Calculation of the noise power spectrum is controlled by the count value and the value of the voice activity detection flag and is performed only when predetermined conditions are satisfied. Multiplexer 503 multiplexes the supplied K noise power spectrum components, and outputs the multiplexed noise power spectrum as an estimated noise power spectrum.

FIG. 7 shows an arrangement of each of frequency-dependent noise estimation units 514₀ to 514_(K−1) included in noise estimation unit 51 (FIG. 6). Since frequency-dependent noise estimation units 514₀ to 514_(K−1) are identical in arrangement to each other, they are indicated as frequency-dependent noise estimation unit 514 in FIG. 7. The noise estimation algorithm disclosed in Reference 2 serves to update an estimated noise value in a silent section, and uses instantaneous values of estimated noise which are averaged by a recursive filter, as the estimated noise value. Another noise estimation algorithm is disclosed in IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, Vol. 6, No. 3, pp. 287-292, MAY 1998 (Reference 4), which states that instantaneous values of estimated noise are averaged and used. Reference 4 suggests the implementation of an averaging process using a transversal filter, i.e., a filter comprising a shift register, rather than a recursive filter. Since both of the above implementations have equivalent functions, the process disclosed in Reference 4 will be described below.

Frequency-dependent noise estimation unit 514 has update decision unit 521, register length memory 5041, switch 5044, shift register 5045, adder 5046, minimum value selector 5047, divider 5048, and counter 5049. Switch 5044 is supplied with the frequency-dependent noisy speech power spectrum from demultiplexer 502 (FIG. 6). When switch 5044 closes its circuit, the frequency-dependent noisy speech power spectrum is transmitted to shift register 5045. In response to a control signal supplied from update decision unit 521, shift register 5045 shifts stored values in internal register elements to adjacent register elements. The length of shift register 5045 is equal to a value stored in register length memory 5041. The outputs from all the internal register elements of shift register 5045 are supplied to adder 5046. Adder 5046 adds the supplied outputs from all the internal register elements, and transmits the sum to divider 5048.

On the other hand, update decision unit 521 is supplied with the count value from counter 13 and the voice activity detection flag from voice activity detector 4. Update decision unit 521 outputs “1” at all times until the count value reaches a preset value. After the count value reaches the preset value, update decision unit 521 outputs “1” when the voice activity detection flag is “0”, i.e., during a silent section, and outputs “0” otherwise. Update decision unit 521 transmits its output to counter 5049, switch 5044, and shift register 5045. Switch 5044 closes its circuit when the signal supplied from update decision unit 521 is “1”, and opens its circuit when the signal supplied from update decision unit 521 is “0”. Counter 5049 increments its count value when the signal supplied from update decision unit 521 is “1”, and does not change its count value when the signal supplied from update decision unit 521 is “0”. Shift register 5045 reads one signal sample supplied from switch 5044 and shifts the stored values in the internal register elements to the adjacent register elements, when the signal supplied from update decision unit 521 is “1”.

Minimum value selector 5047 is supplied with the output from counter 5049 and the output from register length memory 5041. Minimum value selector 5047 selects the smaller one of the count value and the register length which are supplied thereto, and transmits the selected value to divider 5048. Divider 5048 divides the sum of the frequency-dependent noisy speech power spectrum supplied from adder 5046 by the smaller one of the count value and the register length, and outputs the quotient as a frequency-dependent estimated noise power spectrum λ_(n)(k). If the sample values of the frequency-dependent noisy speech power spectrum components stored in shift register 5045 are represented by B_(n)(k) (n=0, 1, . . . , N−1), then the frequency-dependent estimated noise power spectrum λ_(n)(k) is expressed by equation (11):

$\begin{matrix}{{\lambda_{n}(k)} = {\frac{1}{N}{\sum\limits_{n = 0}^{N - 1}{B_{n}(k)}}}} & (11)\end{matrix}$

where N represents the smaller one of the count value and the register length. Since the count value increases monotonically from zero, the division is initially performed using the count value and later using the register length. Dividing by the register length means determining an average of the values stored in the shift register. Initially, since shift register 5045 does not yet hold sufficiently many values, the sum of the frequency-dependent noisy speech power spectrum is divided by the number of register elements where values are actually stored. The number of register elements where values are actually stored is equal to the count value when the count value is smaller than the register length, and equal to the register length when the count value becomes larger than the register length.
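The shift-register averaging of equation (11) for a single frequency bin can be sketched as follows. The class name, the use of `collections.deque` to stand in for shift register 5045, and the initial estimate of zero before any value has been stored are all assumptions made for the example.

```python
from collections import deque

class FrequencyNoiseEstimator:
    """Sketch of frequency-dependent noise estimation unit 514 for one bin."""

    def __init__(self, register_length):
        self.register_length = register_length          # register length memory 5041
        self.register = deque(maxlen=register_length)   # shift register 5045
        self.count = 0                                   # counter 5049
        self.estimate = 0.0                              # assumed initial value

    def step(self, noisy_power, update):
        """Process one frame; `update` is the output of update decision unit 521."""
        if update:
            self.register.append(noisy_power)  # oldest value drops out when full
            self.count += 1
        n = min(self.count, self.register_length)  # minimum value selector 5047
        if n > 0:
            self.estimate = sum(self.register) / n  # adder 5046 and divider 5048
        return self.estimate

est = FrequencyNoiseEstimator(register_length=3)
history = [est.step(p, u) for p, u in
           [(2.0, 1), (4.0, 1), (100.0, 0), (6.0, 1), (8.0, 1)]]
```

In the example the third frame is skipped because the update signal is “0”, so the large value 100.0 never enters the average.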

FIG. 8 shows an arrangement of update decision unit 521. Update decision unit 521 has NOT circuit 5202, comparator 5203, threshold memory 5204, and OR circuit 5211. In update decision unit 521, the count value supplied from counter 13 (FIG. 1) is transmitted to comparator 5203. Comparator 5203 is also supplied with a threshold value output from threshold memory 5204. Comparator 5203 compares the supplied count value and the supplied threshold value with each other. If the count value is smaller than the threshold value, then comparator 5203 transmits “1” to OR circuit 5211, and if the count value is greater than the threshold value, then comparator 5203 transmits “0” to OR circuit 5211. The voice activity detection flag supplied to update decision unit 521 is transmitted to NOT circuit 5202, which determines the logically inverted value of the input signal and transmits the inverted value to OR circuit 5211. Specifically, NOT circuit 5202 transmits “0” to OR circuit 5211 in a speech section where the voice activity detection flag is “1”, and transmits “1” to OR circuit 5211 in a silent section where the voice activity detection flag is “0”. As a result, OR circuit 5211 outputs “1” during a silent section where the voice activity detection flag is “0” or when the count value is smaller than the threshold value, closing the switch shown in FIG. 7 and counting up counter 5049.
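The logic of update decision unit 521 reduces to an OR of two conditions, which can be sketched as below. The function name is hypothetical, and the case where the count value equals the threshold value, which the description leaves open, is treated here as “not smaller”; that choice is an assumption.

```python
def update_decision(count, count_threshold, vad_flag):
    """Return 1 when the noise estimate may be updated, else 0.

    comparator 5203: 1 while the count value is smaller than the threshold.
    NOT circuit 5202: inverts the voice activity detection flag.
    OR circuit 5211: combines the two.
    """
    below_threshold = 1 if count < count_threshold else 0
    silent = 1 - vad_flag
    return below_threshold | silent

startup = update_decision(count=0, count_threshold=5, vad_flag=1)
speech = update_decision(count=10, count_threshold=5, vad_flag=1)
silence = update_decision(count=10, count_threshold=5, vad_flag=0)
```

The start-up branch forces updates for the first frames regardless of the flag, so the shift register is filled before the voice activity detector begins to gate it.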

FIG. 9 shows an example of an arrangement of frequency-dependent SNR calculator 6 included in the noise suppressor shown in FIG. 1. Frequency-dependent SNR calculator 6 has K dividers 601₀ to 601_(K−1), demultiplexers 602, 603, and a multiplexer 604. In frequency-dependent SNR calculator 6, the noisy speech power spectrum supplied from multiplexed multiplier 17 (FIG. 1) is transmitted to demultiplexer 602. The estimated noise power spectrum supplied from noise estimation unit 51 (FIG. 1) is transmitted to demultiplexer 603. The noisy speech power spectrum is separated into K samples corresponding to respective frequency components by demultiplexer 602, and the K samples are supplied to respective dividers 601₀ to 601_(K−1). The estimated noise power spectrum is separated into K samples corresponding to respective frequency components by demultiplexer 603, and the K samples are supplied to respective dividers 601₀ to 601_(K−1). Dividers 601₀ to 601_(K−1) divide the supplied noisy speech power spectrum by the supplied estimated noise power spectrum, thus determining frequency-dependent SNR γ_(n)(k) according to equation (12), and transmit the frequency-dependent SNR γ_(n)(k) to multiplexer 604:

$\begin{matrix}{{\gamma_{n}(k)} = \frac{{\left| {Y_{n}(k)} \right|}^{2}}{\lambda_{n}(k)}} & (12)\end{matrix}$

where λ_(n)(k) represents the estimated noise power spectrum. Multiplexer 604 multiplexes the transmitted K frequency-dependent SNRs, and outputs the multiplexed SNR as an a-posteriori SNR.
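The per-frequency division of equation (12) can be sketched in a few lines. The function name and the small floor applied to the denominator are assumptions added for the example; the description itself does not say how a zero noise estimate is handled.

```python
def a_posteriori_snr(noisy_amplitude, noise_power, floor=1e-12):
    """Equation (12): gamma(k) = |Y(k)|^2 / lambda(k) for every bin k.

    `floor` is a hypothetical guard against division by zero.
    """
    return [(a * a) / max(lam, floor)
            for a, lam in zip(noisy_amplitude, noise_power)]

gamma = a_posteriori_snr([2.0, 3.0], [1.0, 4.5])
```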

As shown in FIG. 10, a-priori SNR estimator 7 included in the noise suppressor shown in FIG. 1 has multiplexed range limitation processor 701, a-posteriori SNR memory 702, spectral gain memory 703, multiplexed multipliers 704, 705, weight memory 706, multiplexed weighted adder 707, and adder 708.

In a-priori SNR estimator 7, the a-posteriori SNRs γ_(n)(k) (k=0, 1, . . . , K−1) supplied from frequency-dependent SNR calculator 6 (FIG. 1) are transmitted to a-posteriori SNR memory 702 and adder 708. A-posteriori SNR memory 702 stores a-posteriori SNR γ_(n)(k) in the n^(th)-frame and transmits a-posteriori SNR γ_(n−1)(k) in the (n−1)^(th)-frame to multiplexed multiplier 705. The spectral gains Ḡ_(n)(k) (k=0, 1, . . . , K−1) supplied from spectral gain generator 8 are transmitted to spectral gain memory 703. Spectral gain memory 703 stores spectral gain Ḡ_(n)(k) in the n^(th)-frame and transmits spectral gain Ḡ_(n−1)(k) in the (n−1)^(th)-frame to multiplexed multiplier 704. Multiplexed multiplier 704 squares the supplied spectral gain Ḡ_(n−1)(k) to determine Ḡ²_(n−1)(k) and transmits Ḡ²_(n−1)(k) to multiplexed multiplier 705. Multiplexed multiplier 705 multiplies Ḡ²_(n−1)(k) and γ_(n−1)(k) for k=0, 1, . . . , K−1 to determine Ḡ²_(n−1)(k)γ_(n−1)(k), and transmits Ḡ²_(n−1)(k)γ_(n−1)(k) as past estimated SNR 922 to multiplexed weighted adder 707. Multiplexed multipliers 704, 705 are identical in arrangement to multiplexed multiplier 17 already described with reference to FIG. 5 and will not be described here.

The other terminal of adder 708 is supplied with −1, so that the sum γ_(n)(k)−1 is transmitted to multiplexed range limitation processor 701. Multiplexed range limitation processor 701 processes the sum γ_(n)(k)−1 supplied from adder 708 with a range limitation operator P[·], and transmits the result P[γ_(n)(k)−1] as instantaneous estimated SNR 921 to multiplexed weighted adder 707. P[x] is defined by equation (13):

$\begin{matrix}{{P\lbrack x\rbrack} = \left\{ \begin{matrix}{x,} & {x > 0} \\{0,} & \text{otherwise}\end{matrix} \right.} & (13)\end{matrix}$

Multiplexed weighted adder 707 is also supplied with weight 923 from weight memory 706. Multiplexed weighted adder 707 determines estimated a-priori SNR 924 using instantaneous estimated SNR 921, past estimated SNR 922, and weight 923, which are supplied thereto. If weight 923 is represented by α and estimated a-priori SNR 924 is represented by ξ̂_(n)(k), then ξ̂_(n)(k) is calculated according to equation (14):

$\begin{matrix}{{\hat{\xi}_{n}(k)} = {{\alpha\,{\gamma_{n - 1}(k)}{\bar{G}_{n - 1}^{2}(k)}} + {\left( {1 - \alpha} \right){P\left\lbrack {{\gamma_{n}(k)} - 1} \right\rbrack}}}} & (14)\end{matrix}$

where Ḡ²_(−1)(k)γ_(−1)(k)=1.
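Equations (13) and (14) together can be sketched as follows. The function names are hypothetical, and passing `None` for the previous-frame values to select the stated condition Ḡ²_(−1)(k)γ_(−1)(k)=1 is a convention invented for this example.

```python
def range_limit(x):
    """Equation (13): P[x] = x for x > 0, and 0 otherwise."""
    return x if x > 0 else 0.0

def a_priori_snr(gamma_n, gamma_prev, gain_prev, alpha):
    """Equation (14) for every bin k.

    gamma_prev / gain_prev hold the previous frame's a-posteriori SNR and
    spectral gain; None selects the condition G^2 * gamma = 1 given for the
    frame before the first one.
    """
    if gamma_prev is None or gain_prev is None:
        past = [1.0] * len(gamma_n)
    else:
        past = [gp * g * g for gp, g in zip(gamma_prev, gain_prev)]
    return [alpha * p + (1.0 - alpha) * range_limit(gn - 1.0)
            for p, gn in zip(past, gamma_n)]

xi_first = a_priori_snr([3.0], None, None, alpha=0.5)
xi = a_priori_snr([5.0, 0.5], [4.0, 2.0], [0.5, 1.0], alpha=0.5)
```

In the second bin of the example, γ_(n)(k)−1 is negative, so the range limitation removes the instantaneous term and only the past estimate contributes.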

As shown in FIG. 11, above-described multiplexed range limitation processor 701 has constant memory 7011, K maximum value selectors 7012₀ to 7012_(K−1), demultiplexer 7013, and multiplexer 7014. In multiplexed range limitation processor 701, demultiplexer 7013 is supplied with γ_(n)(k)−1 from adder 708 (FIG. 10). Demultiplexer 7013 splits the supplied γ_(n)(k)−1 into K frequency-dependent components and supplies the frequency-dependent components respectively to maximum value selectors 7012₀ to 7012_(K−1), whose other input terminals are supplied with zero from constant memory 7011. Maximum value selectors 7012₀ to 7012_(K−1) compare γ_(n)(k)−1 with zero, and transmit the larger values to multiplexer 7014. This maximum value selecting calculation corresponds to the calculation according to equation (13). Multiplexer 7014 multiplexes the supplied values and outputs the multiplexed value.

As shown in FIG. 12, multiplexed weighted adder 707 has K weighted adders 7071₀ to 7071_(K−1), demultiplexers 7072, 7074, and multiplexer 7075. In multiplexed weighted adder 707, demultiplexer 7072 is supplied with P[γ_(n)(k)−1] as instantaneous estimated SNR 921 from multiplexed range limitation processor 701 (FIG. 10). Demultiplexer 7072 separates P[γ_(n)(k)−1] into K frequency-dependent components, and transmits the frequency-dependent components as frequency-dependent instantaneous estimated SNRs 921₀ to 921_(K−1) to respective weighted adders 7071₀ to 7071_(K−1). Demultiplexer 7074 is supplied with Ḡ²_(n−1)(k)γ_(n−1)(k) as past estimated SNR 922 from multiplexed multiplier 705 (FIG. 10). Demultiplexer 7074 separates Ḡ²_(n−1)(k)γ_(n−1)(k) into K frequency-dependent components, and transmits the frequency-dependent components as past frequency-dependent estimated SNRs 922₀ to 922_(K−1) to respective weighted adders 7071₀ to 7071_(K−1). Weighted adders 7071₀ to 7071_(K−1) are also supplied with weight 923. Weighted adders 7071₀ to 7071_(K−1) carry out the weighted addition according to equation (14), and transmit the results as frequency-dependent estimated a-priori SNRs 924₀ to 924_(K−1) to multiplexer 7075. Multiplexer 7075 multiplexes frequency-dependent estimated a-priori SNRs 924₀ to 924_(K−1) and outputs the multiplexed result as estimated a-priori SNR 924. Operation and arrangement of each of weighted adders 7071₀ to 7071_(K−1) are the same as those of weighted adder 407 already described above with reference to FIG. 4, and will not be described in detail. However, the weighted addition is calculated at all times.

FIG. 13 shows an example of an arrangement of spectral gain generator 8 included in the noise suppressor shown in FIG. 1. Spectral gain generator 8 has K spectral gain search units 801₀ to 801_(K−1), demultiplexers 802, 803, and multiplexer 804. In spectral gain generator 8, demultiplexer 802 is supplied with the a-posteriori SNR from frequency-dependent SNR calculator 6 (FIG. 1). Demultiplexer 802 separates the supplied a-posteriori SNR into K frequency-dependent components and transmits the K frequency-dependent components respectively to spectral gain search units 801₀ to 801_(K−1). Demultiplexer 803 is supplied with the estimated a-priori SNR from a-priori SNR estimator 7 (FIG. 1). Demultiplexer 803 separates the supplied estimated a-priori SNR into K frequency-dependent components and transmits the K frequency-dependent components respectively to spectral gain search units 801₀ to 801_(K−1). Spectral gain search units 801₀ to 801_(K−1) search for spectral gains corresponding to the a-posteriori SNR and the estimated a-priori SNR which have been supplied, and transmit the results to multiplexer 804. Multiplexer 804 multiplexes the supplied spectral gains and outputs the multiplexed result.

FIG. 14 shows an example of an arrangement of spectral gain search units 801₀ to 801_(K−1). Since spectral gain search units 801₀ to 801_(K−1) are identical in arrangement to each other, they are represented as spectral gain search unit 801 in FIG. 14. Spectral gain search unit 801 has spectral gain table 8011 and address converters 8012, 8013. In spectral gain search unit 801, address converter 8012 is supplied with the frequency-dependent a-posteriori SNR from demultiplexer 802 (FIG. 13). Address converter 8012 converts the supplied frequency-dependent a-posteriori SNR into a corresponding address, and transmits the address to spectral gain table 8011. Address converter 8013 is supplied with the frequency-dependent estimated a-priori SNR from demultiplexer 803 (FIG. 13). Address converter 8013 converts the supplied frequency-dependent estimated a-priori SNR into a corresponding address, and transmits the address to spectral gain table 8011. Spectral gain table 8011 outputs spectral gains which are stored in areas corresponding to the addresses supplied from address converter 8012 and address converter 8013, as frequency-dependent spectral gains.
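A table lookup of the kind performed by spectral gain search unit 801 can be sketched as below. The description does not specify how the address converters quantize the SNRs or what values the table holds, so everything concrete here is an assumption: the 1 dB grid from −20 dB to 30 dB, the function names, and in particular the table contents, which are filled with the Wiener gain ξ/(1+ξ) purely as a placeholder and not with the gains of Reference 1.

```python
import math

DB_MIN, DB_MAX, DB_STEP = -20.0, 30.0, 1.0      # hypothetical quantization grid
SIZE = int((DB_MAX - DB_MIN) / DB_STEP) + 1

def address(snr):
    """Address converter: map a linear SNR to a table index on a dB grid."""
    db = 10.0 * math.log10(max(snr, 1e-12))
    db = min(max(db, DB_MIN), DB_MAX)            # clamp to the table range
    return int(round((db - DB_MIN) / DB_STEP))

def grid_value(addr):
    """Linear SNR represented by a table address."""
    return 10.0 ** ((DB_MIN + addr * DB_STEP) / 10.0)

# Placeholder contents: Wiener gain xi/(1+xi), independent of the
# a-posteriori SNR. A real table would hold precomputed gains for each
# (a-posteriori, a-priori) pair.
TABLE = [[grid_value(j) / (1.0 + grid_value(j)) for j in range(SIZE)]
         for _ in range(SIZE)]

def lookup_gain(gamma, xi):
    """Spectral gain table: the two addresses select one stored gain."""
    return TABLE[address(gamma)][address(xi)]

g = lookup_gain(gamma=2.0, xi=1.0)
```

Storing precomputed gains trades memory for computation, which is presumably why the arrangement uses address converters and a table rather than evaluating a gain function for every frequency in every frame.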

The conventional noise suppressor has been described above. With the conventional noise suppressor described above, the power spectrum of noise is updated in a silent section based on the output of the voice activity detector. Therefore, if the detected result from the voice activity detector is incorrect, then the power spectrum of noise cannot be estimated accurately. When a speech section continues for a long time, since no silent section exists, the power spectrum of noise cannot be updated, and hence the accuracy with which the power spectrum of nonstationary noise is estimated is inevitably lowered. Accordingly, the conventional noise suppressor has residual noise and distortion in the enhanced speech.

According to the conventional suppression algorithm, the power spectrum of noise is estimated using the power spectrum of noisy speech. With the conventional algorithm, therefore, the power spectrum of noise cannot be estimated accurately under the influence of the power spectrum of speech contained in the noisy speech, so that noise tends to remain and distortion tends to be introduced in the enhanced speech. According to the conventional noise suppression algorithm, furthermore, because noise suppression is carried out using spectral gains determined by the same calculation method independent of the SNR, a sufficiently high quality cannot be achieved for the enhanced speech.

It is an object of the present invention to provide a method of noise suppression to produce enhanced speech with reduced distortion and noise by accurately estimating the power spectrum of noise independent of the performance of a voice activity detector.

Another object of the present invention is to provide an apparatus fornoise suppression to produce enhanced speech with reduced distortion andnoise by accurately estimating the power spectrum of noise without beinggoverned by the performance of a voice activity detector.

Still another object of the present invention is to provide a method ofnoise suppression to produce enhanced speech suffering with reduceddistortion and noise by accurately estimating the power spectrum ofnoise even in a speech section when the noise is nonstationary.

Yet still another object of the present invention is to provide anapparatus for noise suppression to produce enhanced speech with reduceddistortion and noise by accurately estimating the power spectrum ofnoise even in a speech section when the noise is nonstationary.

A further object of the present invention is to provide a method ofnoise suppression to produce enhanced speech with reduced distortion andnoise by using optimum spectral gains with respect to all SNR values.

A still further object of the present invention is to provide anapparatus for noise suppression to produce enhanced speech with reduceddistortion and noise by using optimum spectral gains with respect to allSNR values.

DISCLOSURE OF THE INVENTION

According to a first aspect of the present invention, there is provided a method of noise suppression, comprising the steps of converting an input signal into a frequency-domain signal and determining a signal-to-noise ratio based on the frequency-domain signal, determining a spectral gain based on the signal-to-noise ratio, correcting the spectral gain to produce a modified spectral gain, weighting the frequency-domain signal using the modified spectral gain, and converting the weighted frequency-domain signal into a time-domain signal to produce an output signal in which noise has been suppressed.

According to a second aspect of the present invention, there is provided an apparatus for noise suppression, comprising a signal-to-noise ratio calculator for converting an input signal into a frequency-domain signal and determining a signal-to-noise ratio using the frequency-domain signal, a spectral gain generator for determining a spectral gain based on the signal-to-noise ratio, a spectral gain modification unit for correcting the spectral gain to produce a modified spectral gain, a multiplier for weighting the frequency-domain signal using the modified spectral gain, and an inverse converter for converting the weighted frequency-domain signal into a time-domain signal.

In the above method of and apparatus for noise suppression, noise is suppressed using a spectral gain modified depending on the value of a signal-to-noise ratio (SNR). Specifically, the apparatus for noise suppression has the spectral gain modification unit which receives the value of the SNR and the spectral gain and calculates a modified spectral gain. By suppressing noise using the spectral gain modified depending on the value of the SNR, it is possible according to the present invention to obtain enhanced speech suffering little distortion and noise with respect to all SNR values.

According to a third aspect of the present invention, there is provided a method of noise suppression, comprising the steps of converting an input signal into a frequency-domain signal and weighting the frequency-domain signal to determine a weighted frequency-domain signal, estimating noise using the weighted frequency-domain signal, determining a signal-to-noise ratio using the estimated noise and the frequency-domain signal, determining a spectral gain based on the signal-to-noise ratio, weighting the frequency-domain signal using the spectral gain, and converting the weighted frequency-domain signal into a time-domain signal to produce an output signal in which noise has been suppressed.

According to a fourth aspect of the present invention, there is provided an apparatus for noise suppression, at least comprising a signal-to-noise ratio calculator for converting an input signal into a frequency-domain signal and determining a signal-to-noise ratio using the frequency-domain signal, a spectral gain generator for determining a spectral gain based on the signal-to-noise ratio, a multiplier for weighting the frequency-domain signal using the spectral gain, and an inverse converter for converting the weighted frequency-domain signal into a time-domain signal, wherein the signal-to-noise ratio calculator includes a weighted frequency-domain signal calculator for weighting the frequency-domain signal to determine a weighted frequency-domain signal, and a noise estimation unit for estimating noise using the weighted frequency-domain signal.

In the above method of and apparatus for noise suppression, the power spectrum of noise is estimated using a weighted frequency-domain signal, i.e., a weighted noisy speech power spectrum. More specifically, the apparatus for noise suppression has the weighted frequency-domain signal calculator, i.e., a weighted noisy speech calculator, which calculates a weighted noisy speech power spectrum from a noisy speech power spectrum and an estimated noise power spectrum. Since a noise power spectrum in a present frame is estimated using a weighted noisy speech power spectrum which is determined from a noisy speech power spectrum and an estimated noise power spectrum in a preceding frame, it is possible to estimate the power spectrum of noise accurately regardless of the nature of noise, thus producing enhanced speech suffering little distortion and noise.

According to a fifth aspect of the present invention, there is provided a method of estimating noise, comprising the steps of determining a signal-to-noise ratio using an input signal and estimated noise, determining a weight using the signal-to-noise ratio, weighting the input signal with the weight to determine a weighted input signal, and determining estimated noise based on the weighted input signal.

According to a sixth aspect of the present invention, there is provided an apparatus for estimating noise, comprising a signal-to-noise calculator for determining a signal-to-noise ratio using an input signal and estimated noise, a weight calculator for determining a weight based on the signal-to-noise ratio, an input signal calculator for weighting the input signal with the weight to determine a weighted input signal, and a noise estimation unit for determining estimated noise based on the weighted input signal.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram showing an arrangement of a conventional noise suppressor;

FIG. 2 is a block diagram showing an arrangement of a voice activity detector included in the noise suppressor shown in FIG. 1;

FIG. 3 is a block diagram showing an arrangement of a power calculator included in the voice activity detector shown in FIG. 2;

FIG. 4 is a block diagram showing an arrangement of a weighted adder included in the voice activity detector shown in FIG. 2;

FIG. 5 is a block diagram showing an arrangement of a multiplexed multiplier included in the noise suppressor shown in FIG. 1;

FIG. 6 is a block diagram showing an arrangement of a noise estimation unit included in the noise suppressor shown in FIG. 1;

FIG. 7 is a block diagram showing an arrangement of a frequency-dependent noise estimation unit included in the noise estimation unit shown in FIG. 6;

FIG. 8 is a block diagram showing an arrangement of an update decision unit included in the frequency-dependent noise estimation unit shown in FIG. 7;

FIG. 9 is a block diagram showing an arrangement of a frequency-dependent SNR calculator included in the noise suppressor shown in FIG. 1;

FIG. 10 is a block diagram showing an arrangement of an a-priori SNR estimator included in the noise suppressor shown in FIG. 1;

FIG. 11 is a block diagram showing an arrangement of a multiplexed range limitation processor included in the a-priori SNR estimator shown in FIG. 10;

FIG. 12 is a block diagram showing an arrangement of a multiplexed weighted adder included in the a-priori SNR estimator shown in FIG. 10;

FIG. 13 is a block diagram showing an arrangement of a spectral gain generator included in the noise suppressor shown in FIG. 1;

FIG. 14 is a block diagram showing an arrangement of a spectral gain search unit included in the spectral gain generator shown in FIG. 13;

FIG. 15 is a block diagram showing an arrangement of a noise suppressor according to a first embodiment of the present invention;

FIG. 16 is a block diagram showing an arrangement of a weighted noisy speech calculator included in the noise suppressor shown in FIG. 15;

FIG. 17 is a block diagram showing an arrangement of a multiplexed nonlinear processor included in the weighted noisy speech calculator;

FIG. 18 is a graph showing an example of a nonlinear function used by the multiplexed nonlinear processor;

FIG. 19 is a block diagram showing an arrangement of a noise estimation unit included in the noise suppressor shown in FIG. 15;

FIG. 20 is a block diagram showing an arrangement of a frequency-dependent noise estimation unit included in the noise estimation unit shown in FIG. 19;

FIG. 21 is a block diagram showing an arrangement of an update decision unit included in the frequency-dependent noise estimation unit shown in FIG. 20;

FIG. 22 is a block diagram showing a second example of the arrangement of a frequency-dependent noise estimation unit included in the noise estimation unit shown in FIG. 19;

FIG. 23 is a block diagram showing an arrangement of a spectral gain modification unit included in the noise suppressor shown in FIG. 15;

FIG. 24 is a block diagram showing an arrangement of a frequency-dependent spectral gain modification unit included in the spectral gain modification unit shown in FIG. 23;

FIG. 25 is a block diagram showing a second example of an arrangement of a spectral gain generator;

FIG. 26 is a block diagram showing an arrangement of a frequency-band-dependent SNR calculator that can be used instead of a frequency-dependent SNR calculator in the noise suppressor shown in FIG. 15;

FIG. 27 is a block diagram showing an arrangement of a frequency-band-dependent power calculator included in the frequency-band-dependent SNR calculator shown in FIG. 26;

FIG. 28 is a block diagram showing an arrangement of a noise suppressor according to a second embodiment of the present invention;

FIG. 29 is a block diagram showing an arrangement of a noise estimation unit included in the noise suppressor shown in FIG. 28;

FIG. 30 is a block diagram showing an arrangement of a frequency-dependent noise estimation unit included in the noise estimation unit shown in FIG. 29;

FIG. 31 is a block diagram showing an arrangement of a noise suppressor according to a third embodiment of the present invention;

FIG. 32 is a block diagram showing an arrangement of an a-priori SNR estimator included in the noise suppressor shown in FIG. 31;

FIG. 33 is a block diagram showing an arrangement of a noise suppressor according to a fourth embodiment of the present invention;

FIG. 34 is a block diagram showing an arrangement of a noise suppressor according to a fifth embodiment of the present invention;

FIG. 35 is a block diagram showing an arrangement of a noise estimation unit included in the noise suppressor shown in FIG. 34;

FIG. 36 is a block diagram showing an arrangement of a frequency-dependent noise estimation unit included in the noise estimation unit shown in FIG. 35; and

FIG. 37 is a block diagram showing an arrangement of an update decision unit included in the frequency-dependent noise estimation unit shown in FIG. 36.

BEST MODE FOR CARRYING OUT THE INVENTION

A noise suppressor according to a first embodiment of the present invention shown in FIG. 15 is similar to the conventional noise suppressor shown in FIG. 1, but differs in that the noise estimation unit has a different internal structure, and weighted noisy speech calculator 14 and spectral gain modification unit 15 are added. Specifically, the noise suppressor according to the first embodiment has noise estimation unit 5 instead of noise estimation unit 51 in the noise suppressor shown in FIG. 1. Weighted noisy speech calculator 14 calculates a weighted noisy speech power spectrum from a noisy speech power spectrum and an estimated noise power spectrum, and outputs the calculated weighted noisy speech power spectrum to noise estimation unit 5. Spectral gain modification unit 15 calculates a modified spectral gain based on a spectral gain and an estimated a-priori SNR. Multiplexed multiplier 16 and a-priori SNR estimator 7 are supplied with the modified spectral gain instead of the spectral gain which is generated by spectral gain generator 8. Voice activity detector 4, noise estimation unit 5, frequency-dependent SNR calculator 6, counter 13, weighted noisy speech calculator 14, and multiplexed multiplier 17 jointly make up SNR (signal-to-noise ratio) calculator 101. A-priori SNR estimator 7 and spectral gain generator 8 jointly make up spectral gain generation unit 102.

In the following description, components denoted by the same reference characters as in FIGS. 1 to 14 are identical to the corresponding components shown in FIGS. 1 to 14. The noise suppressor according to the present embodiment will be described below basically with respect to its differences from the conventional noise suppressor.

As shown in FIG. 16, weighted noisy speech calculator 14 has estimated noise memory 1401, frequency-dependent SNR calculator 1402, multiplexed nonlinear processor 1405, and multiplexed multiplier 1404. Estimated noise memory 1401 stores the estimated noise power spectrum supplied from noise estimation unit 5 (FIG. 15), and outputs the stored estimated noise power spectrum of the previous frame to frequency-dependent SNR calculator 1402. Frequency-dependent SNR calculator 1402 calculates an SNR per frequency using the estimated noise power spectrum supplied from estimated noise memory 1401 and the noisy speech power spectrum supplied from multiplexed multiplier 17 (FIG. 15), and outputs the calculated SNR to multiplexed nonlinear processor 1405. Multiplexed nonlinear processor 1405 calculates a weighting factor vector using the SNR supplied from frequency-dependent SNR calculator 1402, and outputs the weighting factor vector to multiplexed multiplier 1404. Multiplexed multiplier 1404 calculates the product, per frequency, of the noisy speech power spectrum supplied from multiplexed multiplier 17 (FIG. 15) and the weighting factor vector supplied from multiplexed nonlinear processor 1405, and outputs a weighted noisy speech power spectrum to noise estimation unit 5 (FIG. 15). The weighted noisy speech power spectrum corresponds to a weighted amplitude component.

In weighted noisy speech calculator 14, frequency-dependent SNR calculator 1402 is identical in arrangement to frequency-dependent SNR calculator 6 described above with reference to FIG. 9, and multiplexed multiplier 1404 is identical in arrangement to multiplexed multiplier 17 described above with reference to FIG. 5. Therefore, these will not be described in detail below.

Structural details and operation of multiplexed nonlinear processor 1405 included in weighted noisy speech calculator 14 will be described in detail below with reference to FIG. 17. As shown in FIG. 17, multiplexed nonlinear processor 1405 has demultiplexer 1475, K nonlinear processors 1485₀ to 1485_(K−1), and multiplexer 1495. Demultiplexer 1475 separates the SNR supplied from frequency-dependent SNR calculator 1402 (FIG. 16) into frequency-dependent SNRs, and outputs the frequency-dependent SNRs respectively to nonlinear processors 1485₀ to 1485_(K−1). Nonlinear processors 1485₀ to 1485_(K−1) output real-valued numbers depending on the input values based on a nonlinear function. FIG. 18 shows an example of the nonlinear function. When an input value is represented by f₁, the nonlinear function shown in FIG. 18 has an output value f₂ expressed by equation (15):

$\begin{matrix}{f_{2} = \left\{ \begin{matrix}{1,} & {f_{1} \leq a} \\{\frac{f_{1} - b}{a - b},} & {a < f_{1} \leq b} \\{0,} & \text{otherwise}\end{matrix} \right.} & (15)\end{matrix}$

Each of nonlinear processors 1485₀ to 1485_(K−1) processes the frequency-dependent SNR supplied from demultiplexer 1475 with the nonlinear function to determine a weighting factor, and outputs the weighting factor to multiplexer 1495. Specifically, nonlinear processors 1485₀ to 1485_(K−1) output weighting factors ranging from 1 to 0 depending on the SNRs such that they output 1 when the SNR is small and output 0 when the SNR is large. Multiplexer 1495 multiplexes the weighting factors output from nonlinear processors 1485₀ to 1485_(K−1) and outputs a weighting factor vector to multiplexed multiplier 1404.

The weighting factors by which the noisy speech power spectrum is multiplied in multiplexed multiplier 1404 (FIG. 16) have values depending on the SNRs. The weighting factors have smaller values as the SNRs are larger, i.e., as the noisy speech contains a greater speech component. Estimated noise is generally updated using a noisy speech power spectrum. By weighting the noisy speech power spectrum used to update the estimated noise according to the SNR, the influence of the speech component contained in the noisy speech power spectrum can be reduced, so that noise is estimated with higher accuracy. While a nonlinear function is used to calculate weighting factors in this example, it is possible to use a function of the SNR expressed in a form other than the nonlinear function, such as a linear function or a higher-degree polynomial.
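The weighting described above can be sketched in Python as follows, purely as an illustration. The function follows equation (15); the breakpoint values a and b, the use of a linear (non-decibel) SNR obtained by dividing the noisy speech power by the previous estimated noise power, and the function names are assumptions made for this sketch.

```python
def weighting_factor(snr, a, b):
    """Equation (15): 1 up to a, linear ramp from 1 down to 0 between a and b, 0 above b."""
    if snr <= a:
        return 1.0
    if snr <= b:
        return (snr - b) / (a - b)
    return 0.0


def weighted_noisy_speech(noisy_power, prev_noise_power, a=1.0, b=4.0):
    """Per-frequency product of the noisy speech power and its SNR-dependent weight.

    noisy_power      : noisy speech power spectrum of the current frame
    prev_noise_power : estimated noise power spectrum of the previous frame
    a, b             : hypothetical breakpoints of the nonlinear function
    """
    weighted = []
    for y, lam in zip(noisy_power, prev_noise_power):
        # SNR per frequency; an infinite SNR (zero noise estimate) yields weight 0.
        snr = y / lam if lam > 0 else float("inf")
        weighted.append(weighting_factor(snr, a, b) * y)
    return weighted
```

With these hypothetical breakpoints, a bin whose power equals the noise estimate is passed unchanged, while a bin ten times stronger than the noise estimate is excluded from the noise update.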

FIG. 19 shows an arrangement of noise estimation unit 5 included in the noise suppressor. Noise estimation unit 5 is similar to noise estimation unit 51 used in the conventional noise suppressor shown in FIG. 6, except that it has demultiplexer 505, and frequency-dependent noise estimation units 514₀ to 514_(K−1) are replaced with frequency-dependent noise estimation units 504₀ to 504_(K−1). Noise estimation unit 5 will be described below basically with respect to these differences.

Demultiplexer 505 splits the weighted noisy speech power spectrum supplied from weighted noisy speech calculator 14 (FIG. 15) into frequency-dependent weighted noisy speech power spectra, and outputs the frequency-dependent weighted noisy speech power spectra respectively to frequency-dependent noise estimation units 504₀ to 504_(K−1). Frequency-dependent noise estimation units 504₀ to 504_(K−1) calculate frequency-dependent estimated noise power spectra from the frequency-dependent noisy speech power spectra supplied from demultiplexer 502, the frequency-dependent weighted noisy speech power spectra supplied from demultiplexer 505, the voice activity detection flag supplied from voice activity detector 4 (FIG. 15), and the count value supplied from counter 13 (FIG. 15), and output the calculated frequency-dependent estimated noise power spectra to multiplexer 503. Multiplexer 503 multiplexes the frequency-dependent estimated noise power spectra supplied from frequency-dependent noise estimation units 504₀ to 504_(K−1), and outputs a resultant estimated noise power spectrum to frequency-dependent SNR calculator 6 (FIG. 15) and weighted noisy speech calculator 14 (FIG. 15). An arrangement of frequency-dependent noise estimation units 504₀ to 504_(K−1) will be described below.

FIG. 20 shows an arrangement of frequency-dependent noise estimation units 504₀ to 504_(K−1). Since frequency-dependent noise estimation units 504₀ to 504_(K−1) are identical in arrangement to each other, they are indicated as frequency-dependent noise estimation unit 504 in FIG. 20. Frequency-dependent noise estimation unit 504 used herein differs from frequency-dependent noise estimation unit 514 shown in FIG. 7 in that frequency-dependent noise estimation unit 504 has estimated noise memory 5942, update decision unit 521 is replaced with update decision unit 520, and a frequency-dependent weighted noisy speech power spectrum, rather than the frequency-dependent noisy speech power spectrum, is supplied to switch 5044. These differences occur because frequency-dependent noise estimation units 504₀ to 504_(K−1) use the weighted noisy speech power spectrum, rather than the noisy speech power spectrum, in calculating estimated noise, and use the estimated noise and the noisy speech power spectrum in deciding whether to update the estimated noise. Estimated noise memory 5942 stores the frequency-dependent estimated noise power spectrum supplied from divider 5048 and outputs the stored frequency-dependent estimated noise power spectrum of the previous frame to update decision unit 520.

FIG. 21 shows an arrangement of update decision unit 520. Update decision unit 520 differs from update decision unit 521 shown in FIG. 8 in that update decision unit 520 has comparator 5205, threshold memory 5206, and threshold calculator 5207, and OR circuit 5211 is replaced with OR circuit 5201. Update decision unit 520 will be described below basically with respect to these differences.

Threshold calculator 5207 calculates a value depending on the frequency-dependent estimated noise power spectrum supplied from estimated noise memory 5942 (FIG. 20), and outputs the calculated value as a threshold value to threshold memory 5206. According to the simplest process of calculating a threshold value, the frequency-dependent estimated noise power spectrum multiplied by a constant is used as the threshold value. According to another process, a threshold value may be calculated using a higher-degree polynomial or a nonlinear function. Threshold memory 5206 stores the threshold value output from threshold calculator 5207, and outputs the stored threshold value of the previous frame to comparator 5205. Comparator 5205 compares the threshold value supplied from threshold memory 5206 with the frequency-dependent noisy speech power spectrum supplied from demultiplexer 502 (FIG. 19). If the frequency-dependent noisy speech power spectrum is smaller than the threshold value, comparator 5205 outputs “1” to OR circuit 5201. If the frequency-dependent noisy speech power spectrum is greater than the threshold value, comparator 5205 outputs “0” to OR circuit 5201. Thus, comparator 5205 determines whether the noisy speech signal is noise or not based on the magnitude of the estimated noise power spectrum. OR circuit 5201 calculates the logical sum of the output from comparator 5203, the output from NOT circuit 5202, and the output from comparator 5205, and outputs the result to switch 5044, shift register 5045, and counter 5049 (FIG. 20).

Update decision unit 520 thus outputs “1”, thereby updating the estimated noise, not only in an initial state and in a silent section, but also in a speech section if the noisy speech power is small. Since a threshold value is calculated for each frequency, the estimated noise can be updated for each frequency.
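The update decision can be illustrated by the following Python sketch. It uses the simplest threshold rule mentioned above, a constant multiple of the previous noise estimate. Representing the outputs of comparator 5203 and NOT circuit 5202 as an initial-state flag and a silent-section flag is an assumption based on the preceding paragraph, and the multiplier value and function names are hypothetical.

```python
def threshold_from_noise(prev_noise_estimate, factor=2.0):
    """Simplest rule in the text: the noise estimate multiplied by a constant.

    The value of `factor` is a hypothetical choice for illustration.
    """
    return factor * prev_noise_estimate


def update_decision(noisy_power, prev_threshold, is_initial, is_silent):
    """Return True ("1") when the noise estimate for this frequency should be updated."""
    below_threshold = noisy_power < prev_threshold       # role of comparator 5205
    return is_initial or is_silent or below_threshold    # role of OR circuit 5201


def update_flags(noisy_powers, prev_thresholds, is_initial, is_silent):
    """Apply the decision independently to every frequency."""
    return [
        update_decision(y, t, is_initial, is_silent)
        for y, t in zip(noisy_powers, prev_thresholds)
    ]
```

Because the comparison is made per frequency, a speech frame can still update the noise estimate in those bins where the noisy speech power stays below the threshold.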

In FIG. 20, it is assumed that counter 5049 has a count value CNT, shift register 5045 has a register length N, and shift register 5045 stores frequency-dependent weighted noisy speech power spectrum B_(n)(k) (n=0, 1, . . . , N−1). The frequency-dependent estimated noise power spectrum λ_(n)(k) supplied from divider 5048 is expressed by equation (16):

$\begin{matrix}{{\lambda_{n}(k)} = \left\{ \begin{matrix}{{\frac{1}{CNT}{\sum\limits_{n = 0}^{{CNT} - 1}{B_{n}(k)}}},} & {{CNT} < N} \\{{\frac{1}{N}{\sum\limits_{n = 0}^{N - 1}{B_{n}(k)}}},} & \text{otherwise}\end{matrix} \right.} & (16)\end{matrix}$

In other words, the frequency-dependent estimated noise power spectrum λ_(n)(k) represents the average value of the frequency-dependent weighted noisy speech power spectrum stored in shift register 5045. The average value may be calculated using a weighted adder (recursive filter). An arrangement which employs a weighted adder to calculate the frequency-dependent estimated noise power spectrum λ_(n)(k) will be described below.
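The averaging of equation (16) for a single frequency can be sketched in Python as follows. The class name, the use of a bounded deque in place of shift register 5045, and the initial estimate of zero before any update are assumptions of this sketch.

```python
from collections import deque


class AveragingNoiseEstimator:
    """Average of the most recent weighted noisy speech powers for one frequency."""

    def __init__(self, register_length):
        # A deque with maxlen discards the oldest value once N values are stored,
        # which mimics a shift register of length N.
        self.register = deque(maxlen=register_length)
        self.estimate = 0.0  # assumed initial value before the first update

    def step(self, weighted_power, update):
        """Process one frame; `update` is the output of the update decision unit."""
        if update:
            self.register.append(weighted_power)
        if self.register:
            # len(register) equals min(CNT, N), the divisor in equation (16).
            self.estimate = sum(self.register) / len(self.register)
        return self.estimate
```

A frame for which the update decision is negative leaves both the stored values and the estimate unchanged.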

FIG. 22 shows an arrangement of a second example of frequency-dependent noise estimation units 504₀ to 504_(K−1). Since frequency-dependent noise estimation units 504₀ to 504_(K−1) are identical in arrangement to each other, they are indicated as frequency-dependent noise estimation unit 507 in FIG. 22. Frequency-dependent noise estimation unit 507 shown in FIG. 22 has weighted adder 5071 and weight memory 5072 which are added in place of shift register 5045, adder 5046, minimum value selector 5047, divider 5048, counter 5049, and register length memory 5941 in frequency-dependent noise estimation unit 504 shown in FIG. 20.

Weighted adder 5071 calculates frequency-dependent estimated noise using the frequency-dependent estimated noise power spectrum in the previous frame supplied from estimated noise memory 5942, the frequency-dependent weighted noisy speech power spectrum supplied from switch 5044, and the weighting factor output from weight memory 5072, and outputs the calculated frequency-dependent estimated noise to multiplexer 503. Specifically, if the weighting factor stored in weight memory 5072 is represented by δ and the frequency-dependent weighted noisy speech power spectrum is represented by $|\overline{Y}_{n}(k)|^{2}$, then the frequency-dependent estimated noise power spectrum λ_(n)(k) output from weighted adder 5071 is expressed by equation (17). Since weighted adder 5071 is identical in arrangement to weighted adder 407 described above with reference to FIG. 4, weighted adder 5071 will not be described in detail. However, the weighted addition is calculated at all times in weighted adder 5071.

$\begin{matrix}{\lambda_{n}(k) = \delta\,\lambda_{n - 1}(k) + \left( 1 - \delta \right)\left| \overline{Y}_{n}(k) \right|^{2}} & (17)\end{matrix}$
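The recursive update of equation (17) amounts to a first-order smoothing filter, which might be sketched in Python as follows. The value of δ and the function names are assumptions chosen for illustration.

```python
def recursive_noise_update(prev_estimate, weighted_power, delta=0.9):
    """Equation (17): blend the previous estimate with the new weighted power.

    delta is the weighting factor held in the weight memory; 0.9 is a
    hypothetical value used only for this sketch.
    """
    return delta * prev_estimate + (1.0 - delta) * weighted_power


def track_noise(weighted_powers, initial_estimate, delta=0.9):
    """Run the recursion over successive frames for one frequency."""
    estimates = []
    estimate = initial_estimate
    for power in weighted_powers:
        estimate = recursive_noise_update(estimate, power, delta)
        estimates.append(estimate)
    return estimates
```

A value of δ close to 1 makes the estimate change slowly, whereas a smaller δ lets it follow the weighted noisy speech power more quickly.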

Spectral gain modification unit 15 in the noise suppressor shown in FIG. 15 will be described below. Spectral gain modification unit 15 modifies a spectral gain depending on the SNR in order to prevent residual noise which would be introduced due to insufficient suppression when the SNR is low, and also to prevent speech quality degradation due to speech distortion which would occur owing to excessive suppression when the SNR is high. As an example of the spectral gain modification, when the SNR is low, a modification value is added to a spectral gain to suppress residual noise, and when the SNR is high, a minimum value of a spectral gain is limited to prevent speech distortion. As shown in FIG. 23, spectral gain modification unit 15 has K frequency-dependent spectral gain modification units 1501₀ to 1501_(K−1), demultiplexers 1502, 1503, and multiplexer 1504.

Demultiplexer 1502 separates the estimated a-priori SNR supplied from a-priori SNR estimator 7 (FIG. 15) into frequency-dependent components, and outputs the frequency-dependent components respectively to frequency-dependent spectral gain modification units 1501₀ to 1501_(K−1). Demultiplexer 1503 separates the spectral gain supplied from spectral gain generator 8 (FIG. 15) into frequency-dependent components, and outputs the frequency-dependent components respectively to frequency-dependent spectral gain modification units 1501₀ to 1501_(K−1). Each of frequency-dependent spectral gain modification units 1501₀ to 1501_(K−1) calculates a frequency-dependent modified spectral gain from the frequency-dependent estimated a-priori SNR supplied from demultiplexer 1502 and the frequency-dependent spectral gain supplied from demultiplexer 1503, and outputs the calculated frequency-dependent modified spectral gain to multiplexer 1504. Multiplexer 1504 multiplexes the frequency-dependent modified spectral gains supplied from frequency-dependent spectral gain modification units 1501₀ to 1501_(K−1), and outputs a multiplexed modified spectral gain to multiplexed multiplier 16 and a-priori SNR estimator 7.

FIG. 24 shows an arrangement of frequency-dependent spectral gain modification units 1501₀ to 1501_(K−1). Since frequency-dependent spectral gain modification units 1501₀ to 1501_(K−1) are identical in arrangement to each other, they are indicated as frequency-dependent spectral gain modification unit 1501 in FIG. 24. Frequency-dependent spectral gain modification unit 1501 has maximum value selector 1591, spectral gain lower limit memory 1592, threshold memory 1593, comparator 1594, switch (selector) 1595, modification value memory 1596, and multiplier 1597.

Comparator 1594 compares a threshold value supplied from threshold memory 1593 and the frequency-dependent estimated a-priori SNR supplied from demultiplexer 1502 (FIG. 23) with each other. If the frequency-dependent estimated a-priori SNR is greater than the threshold value, then comparator 1594 supplies “0” to switch 1595. If the frequency-dependent estimated a-priori SNR is smaller than the threshold value, then comparator 1594 supplies “1” to switch 1595. Switch 1595 outputs the signal supplied from demultiplexer 1503 (FIG. 23) to multiplier 1597 when the output from comparator 1594 is “1”. Switch 1595 outputs the signal supplied from demultiplexer 1503 to maximum value selector 1591 when the output from comparator 1594 is “0”. That is, when the frequency-dependent estimated a-priori SNR is smaller than the threshold value, the spectral gain is modified. As the spectral gain is modified when the SNR is small, the speech component is not excessively suppressed, and the amount of residual noise is reduced. Multiplier 1597 calculates the product of the output value from switch 1595 and the output value from modification value memory 1596, and outputs the calculated result to maximum value selector 1591. In order to reduce the spectral gain value, the modification value is normally smaller than 1. However, the modification value may be selected otherwise depending on the purpose of the noise suppressor. In the conventional noise suppressor shown in FIG. 1, the spectral gain is supplied to multiplexed multiplier 16 and a-priori SNR estimator 7. In the noise suppressor according to the first embodiment, however, the modified spectral gain, rather than the spectral gain, is supplied to multiplexed multiplier 16 and a-priori SNR estimator 7.

Spectral gain lower limit memory 1592 supplies a stored lower limit for the spectral gain to maximum value selector 1591. Maximum value selector 1591 compares the frequency-dependent spectral gain supplied from switch 1595 and the spectral gain lower limit value supplied from spectral gain lower limit memory 1592 with each other, and outputs the larger one of them to multiplexer 1504 (FIG. 23). That is, the spectral gain is never smaller than the lower limit stored in spectral gain lower limit memory 1592. Therefore, speech distortion due to excessive suppression is prevented.
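The operation of frequency-dependent spectral gain modification unit 1501 can be summarized by the following Python sketch. The numerical values of the threshold, the modification value, and the lower limit are hypothetical, as are the function names. Leaving the gain unmodified when the SNR exactly equals the threshold is also an assumption, since the description covers only the greater-than and smaller-than cases.

```python
def modify_gain(gain, priori_snr, threshold=1.0, modification=0.5, lower_limit=0.1):
    """Return the modified spectral gain for one frequency.

    gain         : spectral gain from the spectral gain generator
    priori_snr   : estimated a-priori SNR for the same frequency
    threshold    : hypothetical content of threshold memory 1593
    modification : hypothetical content of modification value memory 1596
    lower_limit  : hypothetical content of spectral gain lower limit memory 1592
    """
    if priori_snr < threshold:
        # Low SNR: scale the gain by the modification value (multiplier 1597).
        gain = gain * modification
    # Limit the result from below (maximum value selector 1591).
    return max(gain, lower_limit)


def modify_gains(gains, priori_snrs, **params):
    """Apply the modification independently to every frequency."""
    return [modify_gain(g, s, **params) for g, s in zip(gains, priori_snrs)]
```

The lower limit applies on both paths, so a gain that is already small is raised to the limit whether or not the modification value has been applied.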

FIG. 25 shows a second example of the arrangement of the spectral gain generator included in the noise suppressor shown in FIG. 15. Spectral gain generator 81 illustrated herein has MMSE STSA gain function value calculator 811, generalized likelihood ratio calculator 812, speech presence probability memory 813, and spectral gain calculator 814. Spectral gain generator 81 differs from spectral gain generator 8 shown in FIG. 15, which determines a spectral gain through search, in that spectral gain generator 81 calculates a spectral gain from an estimated a-priori SNR and an a-posteriori SNR that are supplied thereto. A process of calculating a spectral gain based on equations described in Reference 1 will be described below.

It is assumed that a frame number is represented by n, a frequency number is represented by k, γ_(n)(k) represents the frequency-dependent a-posteriori SNR supplied from frequency-dependent SNR calculator 6 (FIG. 15), and $\hat{\xi}_{n}(k)$ represents the frequency-dependent estimated a-priori SNR supplied from a-priori SNR estimator 7 (FIG. 15). It is also assumed that:

$\eta_{n}(k) = \frac{\hat{\xi}_{n}(k)}{q},\qquad \nu_{n}(k) = \frac{\eta_{n}(k)\,\gamma_{n}(k)}{1 + \eta_{n}(k)}$

MMSE STSA gain function value calculator 811 calculates MMSE STSA gain function values for respective frequencies based on the a-posteriori SNR supplied from frequency-dependent SNR calculator 6, the estimated a-priori SNR supplied from a-priori SNR estimator 7, and a speech presence probability q supplied from speech presence probability memory 813, and outputs the calculated MMSE STSA gain function values to spectral gain calculator 814. The MMSE STSA gain function values G_(n)(k) for the respective frequencies are given by equation (18). In equation (18), I₀(z) represents the 0^(th)-order modified Bessel function, and I₁(z) represents the 1^(st)-order modified Bessel function. The modified Bessel functions are described in “Dictionary of mathematics”, 1985, Iwanami Shoten, page 374 G (Reference 5).

$\begin{matrix}{{G_{n}(k)} = {\frac{\sqrt{\pi}}{2}\frac{\sqrt{v_{n}(k)}}{\gamma_{n}(k)}{{\exp\left( {- \frac{v_{n}(k)}{2}} \right)} \cdot \left\lbrack {{\left( {1 + {v_{n}(k)}} \right){I_{0}\left( \frac{v_{n}(k)}{2} \right)}} + {{v_{n}(k)}{I_{1}\left( \frac{v_{n}(k)}{2} \right)}}} \right\rbrack}}} & (18)\end{matrix}$

Generalized likelihood ratio calculator 812 calculates generalized likelihood ratios for respective frequencies based on the a-posteriori SNR γ_(n)(k) supplied from frequency-dependent SNR calculator 6, the estimated a-priori SNR ξ̂_(n)(k) supplied from a-priori SNR estimator 7, and the speech presence probability q supplied from speech presence probability memory 813, and outputs the calculated generalized likelihood ratios to spectral gain calculator 814. The generalized likelihood ratios Λ_(n)(k) for the respective frequencies are expressed by equation (19):

$\begin{matrix}{{\Lambda_{n}(k)} = {\frac{q}{1 - q}\frac{\exp\left( {v_{n}(k)} \right)}{1 + {\eta_{n}(k)}}}} & (19)\end{matrix}$

Spectral gain calculator 814 calculates spectral gains for respective frequencies from the MMSE STSA gain function values G_(n)(k) supplied from MMSE STSA gain function value calculator 811 and the generalized likelihood ratios Λ_(n)(k) supplied from generalized likelihood ratio calculator 812, and outputs the calculated spectral gains to spectral gain modification unit 15 (FIG. 15). The spectral gains Ḡ_(n)(k) for the respective frequencies are expressed by equation (20):

$\begin{matrix}{{{\overset{\_}{G}}_{n}(k)} = {\frac{\Lambda_{n}(k)}{{\Lambda_{n}(k)} + 1}{G_{n}(k)}}} & (20)\end{matrix}$

In the noise suppressor shown in FIG. 15, it is possible to determine and use common SNRs for respective frequency bands comprising a plurality of frequencies, rather than frequency-dependent SNRs. A second example of frequency-dependent SNR calculator 6 for calculating SNRs for respective bands will be described below.

FIG. 26 shows an arrangement of frequency-band-dependent SNR calculator 61 that can be used instead of frequency-dependent SNR calculator 6 in the noise suppressor shown in FIG. 15. Frequency-band-dependent SNR calculator 61 differs from frequency-dependent SNR calculator 6 shown in FIG. 9 in that it has frequency-band-dependent power calculators 611, 612. Frequency-band-dependent power calculator 611 calculates frequency-band-dependent powers based on the frequency-dependent noisy speech power spectrum supplied from demultiplexer 602, and outputs the calculated frequency-band-dependent powers to dividers 601 ₀ to 601 _(K−1), respectively. Frequency-band-dependent power calculator 612 calculates frequency-band-dependent powers based on the frequency-dependent estimated noise power spectrum supplied from demultiplexer 603, and outputs the calculated frequency-band-dependent powers to dividers 601 ₀ to 601 _(K−1), respectively.

FIG. 27 shows an arrangement of frequency-band-dependent power calculator 611. In the illustrated example, the entire power spectrum is divided into M equal bands, each having a bandwidth L, where L and M are natural numbers satisfying the relationship K=LM.

Frequency-band-dependent power calculator 611 has M adders 6110 ₀ to 6110 _(M−1). Frequency-dependent noisy speech power spectrum components 910 ₀ to 910 _(K−1) (910 ₀ to 910 _(ML−1)) supplied from demultiplexer 602 (FIG. 26) are transmitted respectively to those of adders 6110 ₀ to 6110 _(M−1) which correspond to the respective frequencies. Since the frequency numbers corresponding to the frequency band number 0 are 0 to L−1, for example, frequency-dependent noisy speech power spectrum components 910 ₀ to 910 _(L−1) are transmitted to adder 6110 ₀. Similarly, since the frequency numbers corresponding to the frequency band number 1 are L to 2L−1, frequency-dependent noisy speech power spectrum components 910 _(L) to 910 _(2L−1) are transmitted to adder 6110 ₁. Adders 6110 ₀ to 6110 _(M−1) calculate respective sums of supplied frequency-dependent noisy speech power spectrum components, and output frequency-band-dependent noisy speech power spectrum components 911 ₀ to 911 _(ML−1) (911 ₀ to 911 _(K−1)) to dividers 601 ₀ to 601 _(K−1) (FIG. 26). The calculated results from adders 6110 ₀ to 6110 _(M−1) are supplied as frequency-band-dependent noisy speech power spectrum components for frequencies depending on respective frequency band numbers. For example, the calculated result from adder 6110 ₀ is output as frequency-band-dependent noisy speech power spectrum components 911 ₀ to 911 _(L−1), and the calculated result from adder 6110 ₁ is output as frequency-band-dependent noisy speech power spectrum components 911 _(L) to 911 _(2L−1).
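The band summation performed by adders 6110 ₀ to 6110 _(M−1), and the subsequent division by dividers 601 ₀ to 601 _(K−1), can be sketched in Python as follows. The function names `band_powers` and `band_snr` are hypothetical, and the sketch assumes equal bands of width L with K=LM and a nonzero noise power in every band.

```python
def band_powers(power_spectrum, bandwidth):
    """Sum a K-point power spectrum over equal bands of width `bandwidth` (L).

    Each band sum is repeated for every frequency in that band, so the
    output again has K components, as described for components 911.
    """
    k = len(power_spectrum)
    if bandwidth <= 0 or k % bandwidth != 0:
        raise ValueError("K must be a positive multiple of the bandwidth L")
    out = []
    for start in range(0, k, bandwidth):
        band_sum = sum(power_spectrum[start:start + bandwidth])  # one adder per band
        out.extend([band_sum] * bandwidth)
    return out

def band_snr(noisy_power, noise_power, bandwidth):
    """Divide band-wise noisy speech power by band-wise estimated noise power."""
    noisy = band_powers(noisy_power, bandwidth)
    noise = band_powers(noise_power, bandwidth)
    return [p / n for p, n in zip(noisy, noise)]
```

With K=6 and L=2, `band_powers([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 2)` yields three band sums, each repeated over the two frequencies of its band.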

Frequency-band-dependent power calculator 612 is equivalent in arrangement and operation to frequency-band-dependent power calculator 611. Therefore, frequency-band-dependent power calculator 612 will not be described in detail below.

While the entire power spectrum is divided into a plurality of equal frequency bands in the example described above, it is possible to employ another frequency band dividing method such as a method for dividing the entire power spectrum into critical bands as disclosed in “Hearing and speech”, The Institute of Electronics, Information, and Communication Engineers, pages 115-118, 1980 (Reference 6), or a method for dividing the entire power spectrum into octave bands as disclosed in “Multirate Digital Signal Processing”, Prentice-Hall Inc., USA, 1983 (Reference 7).

A second embodiment of the present invention will be described below. A noise suppressor according to the second embodiment shown in FIG. 28 differs from the noise suppressor according to the first embodiment shown in FIG. 15 in that noise estimation unit 5 is replaced with noise estimation unit 52 and weighted noisy speech calculator 14 is dispensed with. The noise suppressor according to the second embodiment will be described below basically with respect to these differences.

FIG. 29 shows an arrangement of noise estimation unit 52 included in the noise suppressor according to the second embodiment. Noise estimation unit 52 differs from noise estimation unit 5 shown in FIG. 19 in that frequency-dependent noise estimation units 504 ₀ to 504 _(K−1) are replaced with frequency-dependent noise estimation units 506 ₀ to 506 _(K−1) and an input signal for noise estimation unit 52 does not have a weighted noisy speech power spectrum. This is because whereas frequency-dependent noise estimation units 504 ₀ to 504 _(K−1) in noise estimation unit 5 shown in FIG. 19 require the input signal to have a frequency-dependent weighted noisy speech power spectrum, frequency-dependent noise estimation units 506 ₀ to 506 _(K−1) in noise estimation unit 52 do not require the input signal to have a frequency-dependent weighted noisy speech power spectrum.

FIG. 30 is a block diagram showing an arrangement of frequency-dependent noise estimation units 506 ₀ to 506 _(K−1) included in noise estimation unit 52 shown in FIG. 29. Since frequency-dependent noise estimation units 506 ₀ to 506 _(K−1) are identical in arrangement to each other, they are indicated as frequency-dependent noise estimation unit 506 in FIG. 30. Frequency-dependent noise estimation unit 506 differs from frequency-dependent noise estimation unit 504 shown in FIG. 20 in that it does not use an input signal having a weighted noisy speech power spectrum and it has divider 5041, nonlinear processor 5042, and multiplier 5043. Frequency-dependent noise estimation unit 506 will be described below basically with respect to these differences.

Divider 5041 divides the frequency-dependent noisy speech power spectrum supplied from demultiplexer 502 (FIG. 29) by the estimated noise power spectrum in the previous frame which is supplied from estimated noise memory 5942, and outputs the divided result to nonlinear processor 5042. Nonlinear processor 5042, which is identical in arrangement and function to nonlinear processor 1485 shown in FIG. 17, calculates a weighting factor depending on the output from divider 5041, and outputs the calculated weighting factor to multiplier 5043. Multiplier 5043 calculates the product of the frequency-dependent noisy speech power spectrum supplied from demultiplexer 502 (FIG. 29) and the weighting factor supplied from nonlinear processor 5042, and outputs the product to switch 5044.

The output signal from multiplier 5043 is equivalent to the frequency-dependent weighted noisy speech power spectrum components supplied to frequency-dependent noise estimation unit 504 shown in FIG. 20. Specifically, the frequency-dependent weighted noisy speech power spectrum can be calculated inside frequency-dependent noise estimation unit 506. In the noise suppressor according to the second embodiment, therefore, the weighted noisy speech calculator may be dispensed with.
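The chain of divider 5041, nonlinear processor 5042, and multiplier 5043 can be sketched in Python for a single frequency as follows. The shape of the nonlinear function is not specified in this passage, so `example_weight` below (weight 1 at low SNR, falling linearly to 0 at high SNR) and its thresholds `snr_low` and `snr_high` are purely hypothetical placeholders. The sketch also assumes a nonzero estimated noise power in the previous frame.

```python
def example_weight(snr, snr_low=1.0, snr_high=10.0):
    """Hypothetical nonlinear function: smaller weight for larger SNR."""
    if snr <= snr_low:
        return 1.0
    if snr >= snr_high:
        return 0.0
    return (snr_high - snr) / (snr_high - snr_low)

def weighted_noisy_power(noisy_power, prev_noise_power, weight_fn=example_weight):
    """Compute the weighted noisy speech power for one frequency.

    noisy_power      : noisy speech power in the current frame
    prev_noise_power : estimated noise power in the previous frame (> 0)
    """
    snr = noisy_power / prev_noise_power  # divider 5041
    weight = weight_fn(snr)               # nonlinear processor 5042
    return weight * noisy_power           # multiplier 5043
```

With these placeholder thresholds, a component whose power is well above the previous noise estimate contributes little or nothing to the noise update, while a component near the noise floor passes through unchanged.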

A third embodiment of the present invention will be described below. A noise suppressor according to the third embodiment of the present invention shown in FIG. 31 differs from the noise suppressor according to the first embodiment shown in FIG. 15 in that the a-priori SNR estimator has a different internal arrangement. FIG. 32 shows an arrangement of a-priori SNR estimator 71 used in the third embodiment. A-priori SNR estimator 71 differs from a-priori SNR estimator 7 shown in FIG. 10 in that it has estimated noise memory 712, enhanced speech power spectrum memory 713, frequency-dependent SNR calculator 715, and multiplexed multiplier 716 in place of a-posteriori SNR memory 702, spectral gain memory 703, and multiplexed multipliers 705, 704. Furthermore, whereas the input signal for a-priori SNR estimator 7 shown in FIG. 10 contains a spectral gain, the input signal for a-priori SNR estimator 71 shown in FIG. 32 contains a spectral amplitude of enhanced speech and an estimated noise power spectrum instead of a spectral gain.

Multiplexed multiplier 716 squares the spectral amplitude of enhanced speech supplied from multiplexed multiplier 16 (FIG. 31) per frequency to determine an enhanced speech power spectrum, and outputs the determined enhanced speech power spectrum to enhanced speech power spectrum memory 713. Since multiplexed multiplier 716 is equal in arrangement to multiplexed multiplier 17 described above with reference to FIG. 5, multiplexed multiplier 716 will not be described in detail below. Enhanced speech power spectrum memory 713 stores the enhanced speech power spectrum supplied from multiplexed multiplier 716, and outputs the stored enhanced speech power spectrum of the previous frame to frequency-dependent SNR calculator 715. Since frequency-dependent SNR calculator 715 is equal in arrangement to frequency-dependent SNR calculator 6 described above with reference to FIG. 9, frequency-dependent SNR calculator 715 will not be described in detail below. Estimated noise memory 712 stores the estimated noise power spectrum supplied from noise estimation unit 5 (FIG. 31), and outputs the stored estimated noise power spectrum of the previous frame to frequency-dependent SNR calculator 715. Frequency-dependent SNR calculator 715 calculates SNRs, for respective frequencies, of the enhanced speech power spectrum supplied from enhanced speech power spectrum memory 713 and the estimated noise power spectrum supplied from estimated noise memory 712, and outputs the calculated SNRs to multiplexed weighted adder 707.

The output signal of frequency-dependent SNR calculator 715 in a-priori SNR estimator 71 shown in FIG. 32 is equivalent to the output signal of multiplexed multiplier 705 in a-priori SNR estimator 7 shown in FIG. 10. Therefore, according to the third embodiment, a-priori SNR estimator 7 may be replaced with a-priori SNR estimator 71 described above.
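The stated equivalence can be checked numerically with the following Python sketch. It assumes that the enhanced speech amplitude is the noisy speech amplitude multiplied by the spectral gain, that the a-posteriori SNR is the noisy speech power divided by the estimated noise power, and that multiplexed multiplier 705 outputs the product of the squared spectral gain and the a-posteriori SNR of the previous frame. The function names are hypothetical.

```python
def past_snr_from_gain(prev_gain, prev_posteriori_snr):
    """Path assumed for estimator 7: squared gain times a-posteriori SNR."""
    return prev_gain ** 2 * prev_posteriori_snr

def past_snr_from_enhanced(prev_enhanced_amplitude, prev_noise_power):
    """Path of estimator 71: enhanced speech power divided by noise power."""
    return prev_enhanced_amplitude ** 2 / prev_noise_power

# Illustrative previous-frame values for one frequency (arbitrary numbers).
noisy_amplitude = 3.0
noise_power = 2.0
gain = 0.5

posteriori_snr = noisy_amplitude ** 2 / noise_power  # |X|^2 / noise power
enhanced_amplitude = gain * noisy_amplitude          # G * |X|

a = past_snr_from_gain(gain, posteriori_snr)
b = past_snr_from_enhanced(enhanced_amplitude, noise_power)
```

Under these assumptions both paths give the same value, because (G·|X|)² divided by the noise power equals G² multiplied by |X|² divided by the noise power.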

A fourth embodiment of the present invention will be described below. A noise suppressor according to the fourth embodiment of the present invention shown in FIG. 33 differs from the noise suppressor according to the second embodiment shown in FIG. 28 in that a-priori SNR estimator 71 (see FIG. 32) employed in the third embodiment is used as an a-priori SNR estimator. Noise estimation unit 52 is similar in arrangement and operation to the one employed in the second embodiment, and a-priori SNR estimator 71 is similar in arrangement and operation to the one employed in the third embodiment. Therefore, the noise suppressor shown in FIG. 33 performs functions which are equivalent to the functions of the noise suppressor according to the first embodiment shown in FIG. 15.

A fifth embodiment of the present invention will be described below. A noise suppressor according to the fifth embodiment of the present invention shown in FIG. 34 differs from the noise suppressor according to the first embodiment shown in FIG. 15 in that noise estimation unit 5 is replaced with noise estimation unit 53 and voice activity detector 4 is dispensed with. Therefore, this noise suppressor is arranged such that it does not require a voice activity detector for estimating noise. The noise suppressor according to the fifth embodiment will be described below in detail basically with respect to these differences.

FIG. 35 shows an arrangement of noise estimation unit 53 used in the fifth embodiment. Noise estimation unit 53 differs from noise estimation unit 5 shown in FIG. 19 in that frequency-dependent noise estimation units 504 ₀ to 504 _(K−1) are replaced with frequency-dependent noise estimation units 508 ₀ to 508 _(K−1) and the input signal contains no voice activity detection flag.

FIG. 36 shows an arrangement of each of frequency-dependent noise estimation units 508 ₀ to 508 _(K−1). Since frequency-dependent noise estimation units 508 ₀ to 508 _(K−1) are identical in arrangement to each other, they are indicated as frequency-dependent noise estimation unit 508 in FIG. 36. Frequency-dependent noise estimation unit 508 differs from frequency-dependent noise estimation unit 504 shown in FIG. 20 in that update decision unit 520 is replaced with update decision unit 522 and the input signal contains no voice activity detection flag. An arrangement of update decision unit 522 is illustrated in FIG. 37. Update decision unit 522 is different from update decision unit 520 shown in FIG. 21 in that OR circuit 5201 is replaced with OR circuit 5221, NOT circuit 5202 is dispensed with, and the input signal contains no voice activity detection flag. Specifically, update decision unit 522 is different from update decision unit 520 shown in FIG. 21 in that it does not use a voice activity detection flag in updating estimated noise. OR circuit 5221 calculates the logical sum of the output value from comparator 5205 and the output value from comparator 5203, and outputs the result to switch 5044, shift register 5045, and counter 5049 (FIG. 36). Update decision unit 522 outputs “1” at all times until the count value reaches a preset value. After the count value reaches the preset value, update decision unit 522 outputs “1” when the noisy speech power is smaller than the threshold value. As described above with reference to FIG. 21, comparator 5205 determines whether the noisy speech signal is noise or not. That is, comparator 5205 detects speech for each frequency. With the above arrangement, therefore, it is possible to realize an update decision unit which does not require a voice activity detection flag to be contained in the input signal.
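The decision logic of update decision unit 522 can be sketched in Python as follows. The function name `update_decision` and its parameter names are hypothetical. The assignment of the count comparison to comparator 5203 is inferred from the description above rather than stated explicitly.

```python
def update_decision(noisy_power, threshold, count, preset_count):
    """Return 1 when the estimated noise should be updated, otherwise 0.

    noisy_power  : noisy speech power for this frequency
    threshold    : power threshold below which the component is treated as noise
    count        : current value of the update counter
    preset_count : count value after which the power test alone decides
    """
    below_threshold = noisy_power < threshold  # comparator 5205 (noise decision)
    still_counting = count < preset_count      # comparator 5203 (assumed role)
    return 1 if (below_threshold or still_counting) else 0  # OR circuit 5221
```

No voice activity detection flag appears among the inputs, which matches the stated purpose of the fifth embodiment.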

The noise suppressors according to the preferred embodiments of the present invention have been described above. In the above description, it has been assumed that the minimum mean-square error short-time spectral amplitude process is used as a noise suppression algorithm. However, the present invention is also applicable to other noise suppression algorithms. One such noise suppression algorithm is a Wiener filtering process disclosed in PROCEEDINGS OF THE IEEE, Vol. 67, No. 12, pp. 1586-1604, DECEMBER 1979, (Reference 8).
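As an illustration of how another algorithm could supply the spectral gain, the following Python sketch uses one common textbook form of the Wiener gain written in terms of the a-priori SNR. Whether Reference 8 uses this exact parameterization is not stated here, so the formula and the function name `wiener_gain` should be read as assumptions.

```python
def wiener_gain(a_priori_snr):
    """Wiener-type spectral gain xi / (1 + xi) for a non-negative a-priori SNR."""
    if a_priori_snr < 0.0:
        raise ValueError("a-priori SNR must be non-negative")
    return a_priori_snr / (1.0 + a_priori_snr)
```

In such a variant, this gain would take the place of the MMSE STSA gain function value before any spectral gain modification is applied.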

INDUSTRIAL APPLICABILITY

According to the present invention, as described above, since the power spectrum of noise is estimated using a weighted noisy speech power spectrum, the power spectrum of noise can be estimated accurately regardless of the nature of noise, thus producing enhanced speech with reduced distortion and noise. According to the present invention, furthermore, because noise is suppressed using a spectral gain modified dependent on the value of an SNR (signal-to-noise ratio), it is possible to produce enhanced speech with reduced distortion and noise with respect to all SNR values.

1. A method of noise suppression, using a computer to carry out the steps of: converting, by at least one processor, an input signal into a frequency-domain and determining a signal-to-noise ratio based on a frequency-domain signal; determining, by the at least one processor, a spectral gain based on said signal-to-noise ratio; correcting, by the at least one processor, said spectral gain to produce a modified spectral gain; weighting, by the at least one processor, said frequency-domain signal using said modified spectral gain; and converting, by the at least one processor, the weighted frequency-domain signal into a time-domain signal to produce an output signal where noise has been suppressed, wherein said step of determining a spectral gain comprises the step of determining said spectral gain based on a modified signal-to-noise ratio which is produced by correcting said signal-to-noise ratio, and wherein said step of correcting said spectral gain to produce a modified spectral gain comprises the steps of: estimating, by the at least one processor, an a-priori signal-to-noise ratio value across a frequency spectrum, to obtain an estimated a-priori signal-to-noise ratio value across the frequency spectrum; receiving, by the at least one processor, the estimated a-priori signal-to-noise ratio value across the frequency spectrum; demultiplexing, by the at least one processor, the estimated a-priori signal-to-noise ratio value to provide a demultiplexed a-priori signal-to-noise ratio value to a plurality of frequency-dependent spectral gain modification units that cover different frequency ranges of the frequency spectrum; demultiplexing, by the at least one processor, the spectral gain determined in the determining step to provide a demultiplexed spectral gain to the plurality of frequency-dependent spectral gain modification units that cover different frequency ranges of the frequency spectrum; and multiplexing, by the at least one processor, respective outputs of the plurality of frequency-dependent spectral gain modification units to obtain the modified spectral gain, wherein each of the plurality of frequency-dependent spectral gain modification units performs spectral gain correction over the respective different frequency ranges of the frequency spectrum.
 2. The method of noise suppression according to claim 1, wherein said step of determining a signal-to-noise ratio comprises the steps of estimating noise using said frequency-domain signal, and determining said signal-to-noise ratio using the estimated noise and said frequency-domain signal.
 3. The method of noise suppression according to claim 2, wherein said step of estimating noise using said frequency-domain signal comprises the steps of weighting said frequency-domain signal to determine a weighted frequency-domain signal, and estimating noise using said weighted frequency-domain signal.
 4. The method of noise suppression according to claim 3, wherein said step of determining said weighted frequency-domain signal comprises the step of determining a weight using said signal-to-noise ratio.
 5. The method of noise suppression according to claim 4, wherein said step of determining said weighted frequency-domain signal further comprises the steps of processing said signal-to-noise ratio with a nonlinear function to determine a modified weight, and weighting said frequency-domain signal using said modified weight.
 6. An apparatus for noise suppression, the apparatus embodied in at least one processor, the apparatus comprising: a signal-to-noise ratio calculator for converting an input signal into a frequency-domain and determining a signal-to-noise ratio using a frequency-domain signal; a spectral gain generator for determining a spectral gain based on said signal-to-noise ratio; a spectral gain modification unit for correcting said spectral gain to produce a modified spectral gain; a multiplier for weighting said frequency-domain signal using said modified spectral gain; an estimator for estimating an a-priori signal-to-noise ratio value across a frequency spectrum, to obtain an estimated a-priori signal-to-noise ratio value across the frequency spectrum; and an inverse converter for converting the weighted frequency-domain signal into a time-domain signal, wherein said spectral gain generator includes a modified signal-to-noise ratio calculator for correcting said signal-to-noise ratio to determine a modified signal-to-noise ratio, wherein said spectral gain generator determines said spectral gain based on the modified signal-to-noise ratio which is produced by correcting said signal-to-noise ratio, and wherein each of said spectral gain modification units comprises: a first demultiplexer configured to receive an estimated a-priori signal-to-noise ratio value across a frequency spectrum, and to demultiplex the estimated a-priori signal-to-noise ratio value to provide a demultiplexed a-priori signal-to-noise ratio value; a plurality of frequency-dependent spectral gain modification units that cover different frequency ranges of the frequency spectrum and that are configured to receive the demultiplexed a-priori signal-to-noise ratio value; a second demultiplexer configured to receive the spectral gain and to demultiplex the spectral gain to provide a demultiplexed spectral gain to the plurality of frequency-dependent spectral gain modification units; and a multiplexer configured to multiplex respective outputs of the plurality of frequency-dependent spectral gain modification units and to output the modified spectral gain as a result thereof, wherein each of the plurality of frequency-dependent spectral gain modification units performs spectral gain correction over the respective different frequency ranges of the frequency spectrum.
 7. The apparatus for noise suppression according to claim 6, wherein said signal-to-noise ratio calculator includes a noise estimation unit for estimating noise using said frequency-domain signal.
 8. The apparatus for noise suppression according to claim 7, wherein said noise estimation unit includes a weighted frequency-domain signal calculator for weighting said frequency-domain signal to determine a weighted frequency-domain signal.
 9. The apparatus for noise suppression according to claim 8, wherein said weighted frequency-domain signal calculator includes a second signal-to-noise ratio calculator for calculating a signal-to-noise ratio.
 10. The apparatus for noise suppression according to claim 9, wherein said weighted frequency-domain signal calculator further includes a nonlinear processor for processing said signal-to-noise ratio determined by said second signal-to-noise ratio calculator with a nonlinear function to determine a modified weight.
 11. The method of noise suppression according to claim 2, wherein said step of estimating noise using said frequency-domain signal comprises the steps of: storing, in an estimated noise memory, the estimated noise; receiving a noisy speech power spectrum obtained from the frequency-domain signal; determining a frequency-dependent signal-to-noise ratio based on the estimated noise obtained from the estimated noise memory and the noisy speech power spectrum; processing, by a multiplexed nonlinear processor, the frequency-dependent signal-to-noise ratio to determine a weighting factor; and multiplying the weighting factor output by the multiplexed nonlinear processor, with the noisy speech power spectrum, to obtain a weighted noisy speech power spectrum.
 12. The apparatus for noise suppression according to claim 7, wherein said signal-to-noise ratio calculator includes a weighted noisy speech calculator that comprises: an estimated noise memory configured to store the estimated noise output by the noise estimation unit; a multiplexed multiplier configured to receive a noisy speech power spectrum obtained from the frequency-domain signal; a frequency-dependent signal-to-noise ratio calculator configured to determine a frequency-dependent signal-to-noise ratio based on the estimated noise obtained from the estimated noise memory and the noisy speech power spectrum; a multiplexed nonlinear processor configured to process the frequency-dependent signal-to-noise ratio output by the frequency-dependent signal-to-noise ratio calculator to determine a weighting factor; and a multiplexed multiplier configured to multiply the weighting factor output by the multiplexed nonlinear processor, with the noisy speech power spectrum, to obtain a weighted noisy speech power spectrum.
 13. A method of noise suppression, using a computer to carry out the steps of: converting, by at least one processor, an input signal into a frequency-domain signal; and estimating, by the at least one processor, noise based on said frequency-domain signal; wherein the estimating step comprises the steps of: determining, by the at least one processor, a frequency-dependent signal-to-noise ratio based on said frequency-domain signal; determining, by the at least one processor, a weight based on said frequency-dependent signal-to-noise ratio such that the weight is set to a value less than a predetermined weight when the frequency-dependent signal-to-noise ratio is greater than a predetermined signal-to-noise ratio; weighting, by the at least one processor, said frequency-domain signal with said weight to determine a weighted frequency-domain signal; and determining, by the at least one processor, the estimated noise based on said weighted frequency-domain signal.
 14. The method according to claim 13, wherein said step of determining a weight comprises the step of determining the weight from said frequency-dependent signal-to-noise ratio with a non-linear function.
 15. The method according to claim 13, wherein said step of determining a frequency-dependent signal-to-noise ratio comprises the step of determining said frequency-dependent signal-to-noise ratio based on said frequency-domain signal and a previous estimated noise.
 16. The method according to claim 13, wherein the step of determining the estimated noise based on said weighted frequency-domain signal comprises the step of determining said estimated noise from said weighted frequency-domain signal with moving-averaging.
 17. The method according to claim 13, further comprising the step of: suppressing noise in said frequency-domain signal based on the estimated noise.
 18. The method according to claim 17, wherein said suppressing step comprises the steps of: determining a signal-to-noise ratio based on said frequency-domain signal and said estimated noise; and determining a spectral gain based on said signal-to-noise ratio, wherein said suppressing step comprises the step of weighting said frequency-domain signal with said signal-to-noise ratio to suppress noise.
 19. An apparatus for noise suppression, the apparatus embodied in at least one processor, the apparatus comprising: a converter configured to convert an input signal into a frequency-domain signal; a noise estimation unit configured to estimate noise across a frequency spectrum based on said frequency-domain signal; wherein said noise estimation unit comprises: a frequency-dependent signal-to-noise ratio calculator configured to determine a frequency-dependent signal-to-noise ratio based on said frequency-domain signal; a weight calculator configured to determine a weight such that the weight is set to a value less than a predetermined weight when the frequency-dependent signal-to-noise ratio is greater than a predetermined signal-to-noise ratio; a weighted frequency-domain signal calculator configured to weigh said frequency-domain signal with said weight to determine a weighted frequency-domain signal; and an estimated noise calculator configured to determine the estimated noise based on said weighted frequency-domain signal, wherein said frequency-dependent signal-to-noise ratio calculator determines said frequency-dependent signal-to-noise ratio based on said frequency-domain signal and a previous estimated noise.
 20. The apparatus according to claim 19, wherein said weight calculator comprises a nonlinear processor configured to determine the weight from said frequency-dependent signal-to-noise ratio.
 21. The apparatus according to claim 19, wherein said estimated noise calculator comprises a moving-average calculator configured to determine said estimated noise from said weighted frequency-domain signal with moving-averaging.
 22. The apparatus according to claim 19, further comprising: a noise suppressor configured to suppress noise in said frequency-domain signal.
 23. The apparatus according to claim 22, wherein said noise suppressor comprises: a signal-to-noise calculator configured to determine a signal-to-noise ratio based on said frequency-domain signal and said estimated noise; a spectral gain generator configured to determine a spectral gain based on said signal-to-noise ratio; and a weighting unit configured to weight said frequency-domain signal with said signal-to-noise ratio to suppress noise.
 24. The method according to claim 1, wherein the step of estimating an a-priori signal-to-noise ratio value across a frequency spectrum comprises the steps of: receiving and storing, in a first memory, an a-posteriori signal-to-noise ratio across the frequency spectrum; receiving and storing, in a second memory, the spectral gain; performing multiplexed range limitation processing to the a-posteriori signal-to-noise ratio across the frequency spectrum; performing, by a first multiplexed multiplier, a multiplexed multiplication of the spectral gain stored in the second memory in order to obtain and output a squared spectral gain; performing, by a second multiplexed multiplier, a multiplexed multiplication of the squared spectral gain output by the first multiplexed multiplier and the a-posteriori signal-to-noise ratio across the frequency spectrum stored in the first memory, to obtain and output a past estimated signal-to-noise ratio value; and adding a weight to the past estimated signal-to-noise ratio value output by the second multiplexed multiplier to obtain and output an estimated a-priori signal-to-noise ratio value across the frequency spectrum.
 25. The apparatus according to claim 6, further comprising: a first memory configured to receive and store an a-posteriori signal-to-noise ratio across the frequency spectrum; a second memory configured to receive and store the spectral gain; a multiplexed range limitation processor configured to perform multiplexed range limitation processing on the a-posteriori signal-to-noise ratio across the frequency spectrum; a first multiplexed multiplier configured to perform a multiplexed multiplication of the spectral gain stored in the second memory in order to obtain and output a squared spectral gain; a second multiplexed multiplier configured to perform a multiplexed multiplication of the squared spectral gain output by the first multiplexed multiplier and the a-posteriori signal-to-noise ratio across the frequency spectrum stored in the first memory, to obtain and output a past estimated signal-to-noise ratio value; and an adder configured to add a weight to the past estimated signal-to-noise ratio value output by the second multiplexed multiplier to obtain and output an estimated a-priori signal-to-noise ratio value across the frequency spectrum.